python - Geocoding Keywords Issues (GMaps API) - Stack Overflow


I'm developing a geocoding script using Google's Geocoding API, sending each address as a 'str'.

My workflow is as follows: I receive a table with columns like 'street_name', 'house_number', 'state'.

In Chile, the keyword 'Villa' indicates that a place is residential. In practice, 'Villa' behaves like a stop word: it adds no value and can lead to confusion, so my script has these replacement rules:

    
    reemplazos = {
        r'_x002d_': '',
        r'/': '',
        r'_xa0_': ' ',
        # Address information that comes after 'Villa' gets in the way of geocoding
        r'\bVILLA\b.*': '',
    }

When I receive an address like "Arquimides 123 Villa nueva Lote 3", the script returns 'Arquimides 123', which is the intended result. The problem I'm facing now is with an address like "Marga 33, Villa Alemana, Chile": this is a correct address, but because I delete 'Villa' and everything after it, I can no longer locate any address within 'Villa Alemana'.

I apply the regex rules above to each address with this function:

import re

# Apply every replacement rule to an address
def reemplazar_palabras(direccion):
    direccion = str(direccion)  # Make sure it is a string
    for patron, reemplazo in reemplazos.items():
        # Strip spaces as well as commas, since a removal can leave either at the ends
        direccion = re.sub(patron, reemplazo, direccion, flags=re.IGNORECASE).strip(" ,")
    return direccion
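To make the failure concrete, here is a minimal reproduction. It restates the patterns and the function so that it runs on its own, and it strips spaces as well as commas so the results are easy to compare.

```python
import re

# Same patterns as in the question, collected in one dict
reemplazos = {
    r'_x002d_': '',
    r'/': '',
    r'_xa0_': ' ',
    r'\bVILLA\b.*': '',
}

def reemplazar_palabras(direccion):
    direccion = str(direccion)
    for patron, reemplazo in reemplazos.items():
        direccion = re.sub(patron, reemplazo, direccion, flags=re.IGNORECASE).strip(" ,")
    return direccion

# Intended behaviour: the residential suffix is dropped
print(repr(reemplazar_palabras("Arquimides 123 Villa nueva Lote 3")))  # 'Arquimides 123'

# Unintended behaviour: the place name 'Villa Alemana' is dropped too
print(repr(reemplazar_palabras("Marga 33, Villa Alemana, Chile")))     # 'Marga 33'
```

The rule cannot tell a residential 'Villa ...' suffix from a place name that happens to start with 'Villa', because it only looks at the word itself.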

So I'm trying to find a different approach for the project, because it is going to get tedious and fragile if I keep adding 'raw' patterns by hand.

Is there a model or modelling technique that would give a scalable solution?

Personally, I think the code was good, but this kind of obstacle is a high risk, because my method ends up deleting valid addresses.
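One possible direction, short of a full model, is to protect known place names from the 'Villa' rule with a negative lookahead. This is only a sketch under assumptions: `NOMBRES_PROTEGIDOS`, `construir_patron_villa` and `quitar_villa` are hypothetical names, and the two-entry list is illustrative; a real list would have to come from an authoritative list of comunas.

```python
import re

# Hypothetical, illustrative list of place names that start with 'Villa'.
# Each entry is assumed to have the form 'Villa <rest of the name>'.
NOMBRES_PROTEGIDOS = ["Villa Alemana", "Villa Alegre"]

def construir_patron_villa(nombres_protegidos):
    """Build a VILLA rule that does not fire on protected place names."""
    # Keep only the part after 'Villa', e.g. 'Alemana'
    sufijos = [re.escape(n.split(None, 1)[1]) for n in nombres_protegidos]
    excepcion = "|".join(sufijos)
    # (?!...) makes the match fail when 'Villa' is followed by a protected suffix
    return re.compile(rf"\bVILLA\b(?!\s+(?:{excepcion})\b).*", flags=re.IGNORECASE)

PATRON_VILLA = construir_patron_villa(NOMBRES_PROTEGIDOS)

def quitar_villa(direccion):
    return PATRON_VILLA.sub("", str(direccion)).strip(" ,")

print(quitar_villa("Arquimides 123 Villa nueva Lote 3"))  # Arquimides 123
print(quitar_villa("Marga 33, Villa Alemana, Chile"))     # Marga 33, Villa Alemana, Chile
```

This keeps the existing regex workflow, but it has a clear limit: an unprotected 'Villa' that appears before a protected name on the same line still deletes everything after it, so the list only helps when the place name is the first 'Villa' in the string.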

UPDATE: This is how I 'create' the address:


"""
DIRECCION_F is the concatenated address: Direccion_Final ('Direccion' = 'Address').

DIRE_CALLE is meant to contain only the street name, e.g. DIRE_CALLE: Grecia, DIRE_NUMERO: 1322, DIRE_COMUNA: Santiago
"""

df["DIRECCION_F"] = (
    df["DIRE_CALLE"].astype(str).apply(reemplazar_palabras) + ' ' +
    # float(x) avoids calling int.is_integer(), which only exists from Python 3.12
    df["DIRE_NUMERO"].apply(lambda x: f"{int(x)}" if isinstance(x, (int, float)) and float(x).is_integer() else str(x)).apply(reemplazar_palabras) + ', ' +
    df["DIRE_COMUNA"].astype(str).apply(reemplazar_palabras) + ', ' +
    'CHILE')
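Note that the code above runs the same rules over DIRE_COMUNA, so a comuna called 'Villa Alemana' is emptied before the address is even assembled. A minimal sketch of one alternative, assuming the 'Villa' rule is only ever needed on the street field, is to keep a separate rule set per column. The names `REEMPLAZOS_COMUNES`, `REEMPLAZOS_CALLE`, `limpiar` and `construir_direccion` are hypothetical, and plain values stand in for the DataFrame columns.

```python
import re

# Cleanup rules that are safe for every field (same patterns as in the question)
REEMPLAZOS_COMUNES = {r'_x002d_': '', r'/': '', r'_xa0_': ' '}

# Assumption of this sketch: the VILLA rule applies to the street field only
REEMPLAZOS_CALLE = {**REEMPLAZOS_COMUNES, r'\bVILLA\b.*': ''}

def limpiar(texto, reglas):
    """Apply one set of replacement rules to a single field."""
    texto = str(texto)
    for patron, reemplazo in reglas.items():
        texto = re.sub(patron, reemplazo, texto, flags=re.IGNORECASE).strip(" ,")
    return texto

def construir_direccion(calle, numero, comuna):
    """Assemble 'street number, comuna, CHILE' with field-specific rules."""
    return (
        f"{limpiar(calle, REEMPLAZOS_CALLE)} {limpiar(numero, REEMPLAZOS_COMUNES)}, "
        f"{limpiar(comuna, REEMPLAZOS_COMUNES)}, CHILE"
    )

print(construir_direccion("Marga", 33, "Villa Alemana"))
# Marga 33, Villa Alemana, CHILE
print(construir_direccion("Arquimides Villa nueva Lote 3", 123, "Santiago"))
# Arquimides 123, Santiago, CHILE
```

With pandas, the same idea would be `df["DIRE_COMUNA"].apply(lambda c: limpiar(c, REEMPLAZOS_COMUNES))` for the comuna column and `REEMPLAZOS_CALLE` for the street column. This does not help when the place name arrives inside the street field itself.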