
Geospatial generative AI with Amazon Bedrock and Amazon Location Service


Today, geospatial workflows typically consist of loading data, transforming it, and then producing visual insights like maps, text, or charts. Generative AI can automate these tasks through autonomous agents. In this post, we discuss how to use foundation models from Amazon Bedrock to power agents to complete geospatial tasks. These agents can perform various tasks and answer questions using location-based services like geocoding available through Amazon Location Service. We also share sample code that uses an agent to bridge the capabilities of Amazon Bedrock with Amazon Location. Additionally, we discuss the design considerations that went into building it.

Amazon Bedrock is a fully managed service that offers an easy-to-use API for accessing foundation models for text, image, and embedding tasks. Amazon Location offers an API for maps, places, and routing with data provided by trusted third parties such as Esri, HERE, Grab, and OpenStreetMap. If you need full control of your infrastructure, you can use Amazon SageMaker JumpStart, which lets you deploy foundation models yourself and provides access to hundreds of them.
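As a rough illustration of what calling a foundation model through the Amazon Bedrock API can look like, the following sketch sends a single prompt to Claude 2 with boto3. The helper names, the temperature, and the token limit are our own illustrative choices rather than part of the sample project, and the call itself requires AWS credentials with Amazon Bedrock model access.

```python
import json


def build_claude_request(prompt: str, max_tokens: int = 1024) -> str:
    """Build the JSON body for Claude 2's text-completion format on Bedrock."""
    return json.dumps({
        # Claude 2 expects the Human/Assistant turn markers in the prompt.
        "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
        "max_tokens_to_sample": max_tokens,
        "temperature": 0.0,  # illustrative choice, not taken from the project
    })


def invoke_claude(prompt: str, region: str = "us-east-1") -> str:
    """Send one prompt to Claude 2 and return the completion text."""
    import boto3  # imported lazily so the request builder works without AWS

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.invoke_model(
        modelId="anthropic.claude-v2",
        contentType="application/json",
        accept="application/json",
        body=build_claude_request(prompt),
    )
    return json.loads(response["body"].read())["completion"]
```

Keeping the request builder separate from the network call makes the prompt format easy to inspect without an AWS account.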

Solution overview

In the realm of large language models (LLMs), an agent is an entity that can autonomously reason and complete tasks with an LLM’s help. This allows LLMs to go beyond text generation to conduct conversations and complete domain-specific tasks. To guide this behavior, we employ reasoning patterns. According to the research paper Large Language Models are Zero-Shot Reasoners, LLMs excel at high-level reasoning, despite having a knowledge cutoff.

We selected Claude 2 as our foundation model from Amazon Bedrock with the aim of creating a geospatial agent capable of handling geospatial tasks. The overarching concept was straightforward: think like a geospatial data scientist. The task involved writing Python code to read data, transform it, and then visualize it on a map. We used a prompting pattern known as Plan-and-Solve Prompting for this purpose.

Using a Plan-and-Solve strategy allows for multi-step reasoning, with developing a high-level plan as the first task. This works well for our load, transform, and visualize workflow, which is the high-level plan our agent will use. Each of these subtasks is sent to Claude 2 to solve separately.
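The control flow behind this strategy can be sketched in a few lines. This is a minimal sketch, not the project's implementation: the prompt wording and the function names are assumptions, and fake_llm is a stand-in for a real model call so the flow can be run offline.

```python
from typing import Callable, List


def plan_and_solve(task: str, llm: Callable[[str], str]) -> List[str]:
    """Ask for a plan first, then solve each planned step separately."""
    plan_text = llm(f"Devise a step-by-step plan for this task, one step per line:\n{task}")
    steps = [line.strip() for line in plan_text.splitlines() if line.strip()]
    solutions = []
    for step in steps:
        # Each subtask gets its own, narrower prompt.
        solutions.append(llm(f"Task: {task}\nWrite Python code for this step only: {step}"))
    return solutions


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call, used only for illustration."""
    if prompt.startswith("Devise"):
        return "load data\ntransform data\nvisualize data"
    return "# code for: " + prompt.rsplit(": ", 1)[-1]


solutions = plan_and_solve("Draw a price heatmap", fake_llm)
```

With the stub, solutions holds one placeholder snippet per planned step, in plan order.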

We devised an example task to create a price heatmap of Airbnb listings in New York. To plan a path to complete the task, the agent needs to understand the dataset. The agent needs to know the columns in the dataset and the type of data in those columns. We generate a summary from the dataset so the agent can plan for the task provided by the user, in this case, generating a heatmap.
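To make the idea of a dataset summary concrete, here is a dependency-free sketch that extracts column names, a naive type guess per column, and a few sample rows from CSV text. The agent in this post has Claude 2 write pandas and geopandas code for this step; the summarize_csv helper below is our own simplification using only the standard library.

```python
import csv
import io
from typing import Dict, List


def summarize_csv(text: str, sample_rows: int = 3) -> Dict[str, object]:
    """Return column names, a naive type guess per column, and a few rows."""
    rows: List[Dict[str, str]] = list(csv.DictReader(io.StringIO(text)))
    columns = list(rows[0].keys()) if rows else []

    def guess_type(column: str) -> str:
        # A column counts as numeric only if every value parses as a float.
        try:
            for row in rows:
                float(row[column])
            return "number"
        except (TypeError, ValueError):
            return "text"

    return {
        "columns": columns,
        "types": {column: guess_type(column) for column in columns},
        "sample": rows[:sample_rows],
    }


data = "name,latitude,longitude,price\nLoft,40.71,-74.00,150\nStudio,40.73,-73.99,95\n"
summary = summarize_csv(data)
```

A summary like this is small enough to include in a prompt, which is the point: the model sees the schema and a few rows, not the whole file.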

Prerequisites

There are a few prerequisites to deploy the demo. You’ll need access to an AWS account with an access key or AWS Identity and Access Management (IAM) role with permissions to Amazon Bedrock and Amazon Location. You will need to create a map, a place index, and an Amazon Location API key using the Amazon Location console. You will also need access to either a local or virtual environment where Docker is installed. In our demonstration, we use an Amazon Elastic Compute Cloud (Amazon EC2) instance running Amazon Linux with Docker installed.

Read and summarize the data

To give the agent context about the dataset, we prompt Claude 2 to write Python code that reads the data and provides a summary relevant to our task. The following are a few of the prompts we included. The full list is available in the prompts.py file in the project.

Your Role: You are a geospatial data analyzer designed to analyze data schema from arbitrary geospatial datasets.
Your Task: You are provided a set of file URLs. You need to generate a Python function that meets the following:
    Read each file using geopandas. Each file could be CSV, shape file, or GeoJSON.
    Use built-in function resolved_file_url= get_data_file_url(file_url, session_id) to get downloadable URLs.

From these prompts, Claude 2 generated the following Python code:

import geopandas as gpd
import pandas as pd
import numpy as np
 
def analyze_geospatial_data(file_urls):
    dataframes = []
    for file_url in file_urls:
        resolved_file_url = get_data_file_url(file_url, 'a5b188fe-91f7-4f92-92fe-c4030721a427', 'local')
        if file_url.endswith('.csv'):
            df = pd.read_csv(resolved_file_url)
        elif file_url.endswith('.shp'):
            df = gpd.read_file(resolved_file_url)
        elif file_url.endswith('.parquet'):
            df = pd.read_parquet(resolved_file_url)
        elif file_url.endswith('.geojson'):
            df = gpd.read_file(resolved_file_url)
        else:
            raise ValueError('Unsupported file format')
 
        df = df.sample(n=3, random_state=42)
        dataframes.append({
            'file_url': file_url, 
            'resolved_file_url': resolved_file_url,
            'data_frame': df,
            'column_names': list(df.columns)
        })
    return dataframes
 
file_urls = ['agent://airbnb_listings_price.csv']
dataframes = analyze_geospatial_data(file_urls)

The agent:// prefix tells our CLI that this file is stored inside the session storage folder. When you run the CLI, it creates a geospatial-agent-session-storage folder to store local data. The agent now recognizes the uploaded Airbnb data and its column names. Next, let’s ask Claude 2 for some hints to generate a heatmap using these columns. We pass in the following prompts.
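One way such a prefix could be resolved to a local path is sketched here. The resolve_agent_url function and the folder layout beneath the storage root are hypothetical; only the agent:// scheme and the geospatial-agent-session-storage folder name come from the project, whose own get_data_file_url helper may work differently.

```python
from pathlib import Path

AGENT_SCHEME = "agent://"
# Folder name taken from the post; the layout below it is an assumption.
STORAGE_ROOT = Path("geospatial-agent-session-storage")


def resolve_agent_url(file_url: str, session_id: str, root: Path = STORAGE_ROOT) -> str:
    """Map agent://<name> to a path in the session folder; pass other URLs through."""
    if not file_url.startswith(AGENT_SCHEME):
        return file_url
    name = file_url[len(AGENT_SCHEME):]
    # Reject names that would escape the session folder.
    if not name or ".." in Path(name).parts or Path(name).is_absolute():
        raise ValueError(f"Invalid agent URL: {file_url}")
    return (root / session_id / "data" / name).as_posix()
```

Scoping every path to a session ID keeps files from different chat sessions apart.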

Your Role: You are a geospatial data analyzer designed to analyze data schema from arbitrary geospatial data sets.
Your Task: You are provided with a table with some rows data. Your task is to generate a summary that describes the data in the table following the requirements below.
    The first sentence should be summary of the data in the table from the aspect of the user action.
    If there is no geometry column in the table, the second sentence should note column names that can be used to generate a geometry column in geopandas.

Claude 2 replies with a hint:

Summary: The table contains columns for listing id, name, host id, host name, location details like neighborhood, latitude, and longitude that can be used to generate a geometry column for plotting. Price column contains listing price that can be used to generate a heatmap of listing prices.

Plan a solution

Now we can ask Claude 2 to draft a high-level plan. We can use a graph library for Python called NetworkX to map out the steps for solving our problem.

Similar to our last prompt, we pass in a few requirements to guide Claude 2:

Your Role: You are a geospatial data scientist and an expert Python developer.
Your Task: A series of consecutive steps.
Your Reply needs to meet these requirements:
    Create a single NetworkX graph.
    No disconnected components allowed.
    First operations are data loading.
    Use geopandas for spatial data.

From these prompts, Claude 2 generated the following Python code:

import networkx as nx
G = nx.DiGraph()
# Load the data
G.add_node("airbnb_data_url", node_type="data", data_path="agent://airbnb_listings_price.csv", 
           description="URL for Airbnb listing price data")
G.add_node("load_data", node_type="operation", operation_type="load",
           description="Load the Airbnb listing price data")
G.add_edge("airbnb_data_url", "load_data")
G.add_node("airbnb_df", node_type="data", 
           description="Pandas DataFrame containing Airbnb listing prices")
G.add_edge("load_data", "airbnb_df")
# Transform the data
G.add_node("add_geometry", node_type="operation", operation_type="transform",
           description="Create a geometry column using latitude and longitude")
G.add_edge("airbnb_df", "add_geometry")
G.add_node("airbnb_gdf", node_type="data",
           description="GeoDataFrame with geometry column added")
G.add_edge("add_geometry", "airbnb_gdf")
# Visualize the data
G.add_node("plot_heatmap", node_type="operation", operation_type="visualization",
           description="Generate a heatmap of listing prices")
G.add_edge("airbnb_gdf", "plot_heatmap")
G.add_node("price_heatmap", node_type="data", 
           description="The heatmap plot showing Airbnb listing prices")
G.add_edge("plot_heatmap", "price_heatmap")

In this NetworkX graph, Claude 2 breaks down the process into three main segments:

  • Loading data – Importing the Airbnb listing prices from a given URL into a Pandas DataFrame
  • Transforming data – Creating a geometry column based on the latitude and longitude coordinates
  • Visualizing data – Generating a heatmap to display the pricing of Airbnb listings

This approach allows for a clear and straightforward implementation of the geospatial task at hand. We can use Graphviz to visualize the following workflow.

Graph
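Because the plan is a directed acyclic graph, the order in which to run its operations follows from a topological sort. The sketch below uses the standard library's graphlib module (Python 3.9 and later) in place of NetworkX so that it runs without extra packages; with NetworkX, nx.topological_sort(G) plays the same role. The edge list is copied from the plan above, and the operation_order helper is our own illustration.

```python
from graphlib import TopologicalSorter

# Edges copied from the plan above: (source, target).
EDGES = [
    ("airbnb_data_url", "load_data"),
    ("load_data", "airbnb_df"),
    ("airbnb_df", "add_geometry"),
    ("add_geometry", "airbnb_gdf"),
    ("airbnb_gdf", "plot_heatmap"),
    ("plot_heatmap", "price_heatmap"),
]
OPERATIONS = {"load_data", "add_geometry", "plot_heatmap"}


def operation_order(edges, operations):
    """Return operation nodes in an order that respects every edge."""
    predecessors = {}
    for source, target in edges:
        predecessors.setdefault(target, set()).add(source)
        predecessors.setdefault(source, set())
    ordered = TopologicalSorter(predecessors).static_order()
    # Data nodes only connect operations; keep the operations themselves.
    return [node for node in ordered if node in operations]


order = operation_order(EDGES, OPERATIONS)
```

For this linear plan the result is load_data, add_geometry, plot_heatmap, which matches the three segments listed above.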

Implement the plan

Now that Claude 2 has provided us with a plan, it’s time to bring it to life. For each step, we prompt Claude 2 to write the corresponding code. To keep Claude 2 focused, we supply high-level requirements for each task. Let’s dive into the code that Claude 2 generated for each individual phase.
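A per-step prompt of this kind could be derived from the plan graph roughly as follows. This is a sketch under assumptions: build_operation_prompt and its parameters are hypothetical, and the project's prompts.py contains the actual templates and many more requirements. The idea is that a node's incoming data nodes become the function parameters and its outgoing data node becomes the return value.

```python
def build_operation_prompt(node, description, inputs, output, defaults=None):
    """Compose a per-step prompt with a fixed function signature and return line."""
    defaults = defaults or {}
    # Parameters with a known default (for example, a data URL) are spelled out.
    params = ", ".join(
        f'{name}="{defaults[name]}"' if name in defaults else name
        for name in inputs
    )
    return "\n".join([
        "Your Role: You are a geospatial data scientist and an expert Python developer.",
        f"Operation_task: You need to generate a Python function to do: {description}.",
        "Your Reply needs to meet these requirements:",
        f"    The function definition is: {node}({params}).",
        f"    The function return line is: return {output}.",
    ])


prompt = build_operation_prompt(
    node="load_data",
    description="Load Airbnb data",
    inputs=["airbnb_data_url"],
    output="airbnb_df",
    defaults={"airbnb_data_url": "agent://airbnb_listings_price.csv"},
)
```

Fixing the signature and return line in the prompt is what lets independently generated functions be chained later without renaming anything.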

Load the data

To load the Airbnb listing price data into a Pandas DataFrame, we create a prompt and pass in some parameters. The “Load Airbnb data” text in Operation_task refers to the load_data node in the graph we created earlier.

Your Role: You are a geospatial data scientist and an expert Python developer.
Operation_task: You need to generate a Python function to do: Load Airbnb data.
Your Reply needs to meet these requirements:
    The function definition is: 
    load_data(airbnb_data_url="agent://airbnb_listings_price.csv").
    When doing spatial joins, retain at least 1 geometry column.
    Point function requires importing shapely library.

From these prompts, Claude 2 generated the following Python code:

import pandas as pd
def load_data(airbnb_data_url="agent://airbnb_listings_price.csv"):
    """
    Load Airbnb listing price data
    """
    file_url = get_data_file_url(airbnb_data_url, session_id, storage_mode) 
    airbnb_df = pd.read_csv(file_url)
    return airbnb_df

Transform the data

Next, Claude 2 generates the code to add a geometry column to our DataFrame using latitude and longitude. For this prompt, we pass in the following requirements:

Your Role: You are a geospatial data scientist and an expert Python developer.
Operation_task: You need to generate a Python function to do: 
    Add geometry column from latitude/longitude.
    Your Reply needs to meet these requirements:
    The function definition is: add_geometry(filtered_df).
    The function return line is: return airbnb_gdf.

From these prompts, Claude 2 generated the following Python code:

import geopandas as gpd
from shapely.geometry import Point
def add_geometry(airbnb_df):
    """
    Add geometry column from latitude and longitude
    """
    airbnb_df['geometry'] = airbnb_df.apply(lambda row: Point(row['longitude'], row['latitude']), axis=1)
    airbnb_gdf = gpd.GeoDataFrame(airbnb_df, geometry='geometry')
 
    return airbnb_gdf

Visualize the data

Finally, Claude 2 builds a heatmap visualization using pydeck, which is a Python library for spatial rendering. For this prompt, we pass in the following requirements:

Your Role: You are a geospatial data scientist and an expert Python developer.
Operation_task: You need to generate a Python function to do: Generate heatmap from price.
Your Reply needs to meet these requirements:
    The function definition is: generate_heatmap(airbnb_gdf).
    The function return line is: return airbnb_heatmap.

From these prompts, Claude 2 generated the following Python code:

import pandas as pd
import geopandas as gpd
import pydeck as pdk
 
def plot_heatmap(airbnb_gdf):
    """
    Plot heatmap of listing prices
    """
    price_heatmap = pdk.Deck(
        map_style=location_map_style(),
        initial_view_state=pdk.ViewState(
            latitude=airbnb_gdf['latitude'].mean(),
            longitude=airbnb_gdf['longitude'].mean(),
            zoom=11,
            pitch=50,
        ),
        layers=[
            pdk.Layer(
                'HexagonLayer',
                data=airbnb_gdf,
                get_position=['longitude', 'latitude'],
                radius=100,
                elevation_scale=4,
                elevation_range=[0, 1000],
                pickable=True,
                extruded=True,
            ),
            pdk.Layer(
                'ScatterplotLayer',
                data=airbnb_gdf,
                get_position=['longitude', 'latitude'],
                get_color="[200, 30, 0, 160]",
                get_radius=200,
            ),
        ],
    )
 
    # Save heatmap HTML
    price_heatmap.to_html(get_local_file_path('airbnb_heatmap.html', session_id, task_name))
 
    return price_heatmap

When Claude 2 returns a response, it also includes some helpful notes explaining how each function meets the provided requirements. For example, for the heatmap visualization, Claude 2 noted the following:

"This function generates a heatmap of Airbnb listing prices using pydeck and saves the resulting HTML locally. It fulfills the requirements specified in the prompt."

Assemble the generated code

Now that Claude 2 has created the individual building blocks, it’s time to put it all together. The agent automatically assembles all these snippets into a single Python file. This script calls each of our functions in sequence, streamlining the entire process.

The final step looks like the following code:

session_id = "a5b188fe-91f7-4f92-92fe-c4030721a427"
task_name = "1694813661_airbnb_listings_price_heatmap"
storage_mode = "local"
# Sequentially invoke the functions
airbnb_df = load_data(airbnb_data_url="agent://airbnb_listings_price.csv")
airbnb_gdf = add_geometry(airbnb_df)
price_heatmap = plot_heatmap(airbnb_gdf)

After the script finishes, we can see that it has written an HTML file containing the code to visualize our heatmap. The following image shows New York on an Amazon Location basemap with a heatmap visualizing Airbnb listing prices.

Heat Map Visualization
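The assembly step described in this section can be approximated with plain string handling. In this sketch, assemble_script is a hypothetical helper and the two toy snippets stand in for the functions Claude 2 generated; the project's own assembly logic may differ.

```python
def assemble_script(snippets, calls, session_id, task_name, storage_mode="local"):
    """Join generated function definitions and append the sequential calls."""
    header = [
        f"session_id = {session_id!r}",
        f"task_name = {task_name!r}",
        f"storage_mode = {storage_mode!r}",
    ]
    # Each call is (output variable, function name, argument expression).
    invocations = [f"{output} = {function}({argument})" for output, function, argument in calls]
    return "\n\n".join(["\n".join(header), *snippets, "\n".join(invocations)]) + "\n"


# Toy snippets standing in for model-generated functions.
snippets = [
    "def load_data(url):\n    return [1, 2, 3]",
    "def double(values):\n    return [v * 2 for v in values]",
]
calls = [("raw", "load_data", "'agent://demo.csv'"), ("result", "double", "raw")]
script = assemble_script(snippets, calls, "demo-session", "demo-task")

namespace = {}
exec(script, namespace)  # only run generated code you trust or have sandboxed
```

Because the script is built from model output, running it inside a container, as the demo does with Docker, limits what a faulty snippet can affect.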

Use Amazon Location with Amazon Bedrock

Although our Plan-and-Solve agent can handle this geospatial task, we need to take a slightly different approach for tasks like geocoding an address. For this, we can use a strategy called ReAct, where we combine reasoning and acting with our LLM.

In the ReAct pattern, the agent reasons and acts based on customer input and the tools at its disposal. To equip this Claude 2-powered agent with the capability to geocode, we developed a geocoding tool. This tool uses the Amazon Location Places API, specifically the SearchPlaceIndexForText method, to convert an address into its geographic coordinates.

Agent: Hi! I'm Agent Smith, your conversational geospatial assistant. How can I assist you today?
You: >? Hello, can you give me the coordinates for 112 E 11th St, Austin, TX 78701?
Agent: The coordinates for 112 E 11th St, Austin, TX 78701 are longitude -97.740590981087 and latitude 30.274118017533.

Within this brief exchange, the agent deciphers your intent to geocode an address, activates the geocoding tool, and returns the latitude and longitude.
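A geocoding tool along these lines can be quite small. The following is a minimal sketch, not the project's tool: the geocode and make_location_client names are our own, while search_place_index_for_text is the boto3 method for the SearchPlaceIndexForText operation. The client is passed in as a parameter so the function can be exercised with a stub instead of a live AWS call.

```python
def geocode(address, client, place_index_name):
    """Return (longitude, latitude) for an address, or None if nothing matches."""
    response = client.search_place_index_for_text(
        IndexName=place_index_name,
        Text=address,
        MaxResults=1,
    )
    results = response.get("Results", [])
    if not results:
        return None
    # Amazon Location returns points as [longitude, latitude].
    longitude, latitude = results[0]["Place"]["Geometry"]["Point"]
    return longitude, latitude


def make_location_client(region="us-east-1"):
    """Create the boto3 Amazon Location client (requires AWS credentials)."""
    import boto3  # imported lazily so geocode() can be tested with a stub

    return boto3.client("location", region_name=region)
```

In a ReAct loop, the agent would call a function like this when the model decides that the user's request needs coordinates, then feed the result back to the model to phrase the answer.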

Whether it’s plotting a heatmap or geocoding an address, Claude 2 combined with agent patterns like ReAct and Plan-and-Solve can simplify geospatial workflows.

Deploy the demo

To get started, complete the following steps:

  1. Clone the following repository either to your local machine or to an EC2 instance. You may need to run aws configure --profile <profilename> and set a default Region; this application was tested using us-east-1.
git clone https://github.com/aws-samples/amazon-location-geospatial-agent/

Now that we have the repository cloned, we configure our environment variables.

  2. Change directories into the cloned project folder:
cd amazon-location-geospatial-agent

  3. Edit the .env file using your preferred text editor:
  4. Add your map name, place index name, and API key:
API_KEY_NAME=AgentAPIKey
MAP_NAME=AgentMap
PLACE_INDEX_NAME=AgentPlaceIndex

  5. Run the following command to build your container:
  6. Run the following command to run and connect to your Docker container:
docker run --rm -it -v ~/.aws:/root/.aws --entrypoint bash agent

  7. Grab the Airbnb dataset:
apt install -y wget
wget http://data.insideairbnb.com/united-states/ny/new-york-city/2023-10-01/visualisations/listings.csv
cp listings.csv data/listings.csv

  8. Run the following command to create a session. We use sessions to isolate unique chat environments.
SESSION_ID="3c18d48c-9c9b-488f-8229-e2e8016fa851" FILE_NAME="listings.csv" make create-session

Now you’re ready to start the application.

  9. Run the following command to begin the chat application:
poetry run agent --session-id 3c18d48c-9c9b-488f-8229-e2e8016fa851 --profile <profilename>

You will be greeted with a chat prompt.

  10. You can begin by asking the following question:
I've uploaded the file listings.csv. Draw a heatmap of Airbnb listing price.

The agent reads the listings.csv file we downloaded to the /data folder and parses it into a geospatial DataFrame. Then it generates the code to transform the data as well as the code for the visualization. Finally, it writes an HTML file to the /data folder, which you can open in a browser to view the heatmap.

Another example uses the Amazon Location Places API to geocode an address. If we ask the agent to geocode the address 112 E 11th St, Austin, TX 78701, we will get a response as shown in the following image.

Example Interaction

Conclusion

In this post, we provided a brief overview of Amazon Bedrock and Amazon Location, and how you can use them together to analyze and visualize geospatial data. We also walked through Plan-and-Solve and ReAct and how we used them in our agent.

Our example only scratches the surface. Try downloading our sample code and adding your own agents and tools for your geospatial tasks.


About the authors

Jeff Demuth is a solutions architect who joined Amazon Web Services (AWS) in 2016. He focuses on the geospatial community and is passionate about geographic information systems (GIS) and technology. Outside of work, Jeff enjoys traveling, building Internet of Things (IoT) applications, and tinkering with the latest gadgets.

Swagata Prateek is a Senior Software Engineer working in Amazon Location Service at Amazon Web Services (AWS) where he focuses on Generative AI and geospatial.


