Creating a database retrieval agent with ellmer and dbplyr

R
LLM
Database
Tutorial
Creating an agent to retrieve data from a database using ellmer and dbplyr.
Author

Kent Orr

Published

October 4, 2025

For every 10 requests I get as an analyst, 9 of them are retrieving something that’s a pain in Excel but easy in SQL. The other 1 is something that has an actual predictive or analytical component. At least for me, it’s typically those 9 that are the time-sensitive items.

In a lot of ways, I think of my role at work as a data analyst as democratizing access to data. The more I can empower the company, the better. A lot of the time I end up empowering the executive level because those are the requests that get priority. But there’s value in a bottom-up approach, allowing sales and operational teams to make their own strategic decisions. After all, nobody knows how to improve warehouse pick module performance better than the pickers, especially when they’re given easy access to the metrics that decide if they’re getting a raise or not.

If we think of the working model as:

  1. I receive a human language request to answer a business problem.
  2. I convert that human language request to code, often a SQL query.
  3. That code gets accompanied by an explanation, usually a Quarto doc or chart.
  4. The code gets run, and the results are shared with the human requestor.

Rinse and repeat.

This is perfect fodder for a junior analyst. Or, I could code up an LLM agent that can deliver the entire request in a few seconds at a fraction of a cent. I know which one my boss wants, even if it does make me wonder what the future of our species will look like.

The Stack

I have read access to a database, an LLM API subscription, and an R Shiny server.

My plan:

  • I will use the ellmer package to create an LLM agent.
  • That agent will write dbplyr code that matches the database schema.
  • R will run the dbplyr code, and a preview of the results will be given to the LLM agent.
  • The human requestor will then be able to review the preview, make adjustments, and ask for additional context.
  • If the human requestor is satisfied with the results, they can retrieve the full query.

My concerns:

Using LLMs right now is like riding a horse. Just because you pull the reins and kick your heels doesn’t mean that the horse will respond exactly the way you want. You’ll get there, but you don’t have fine control over where the hooves go or when the tail pops up to drop a load. Even worse, a horse can wander away with your saddle bags still attached. So I want to be sure that the environment is set up so that my LLM cannot execute malicious code. And maybe malicious is a bit of a dramatic term, but the point is we want to keep permissions as tight as possible to avoid unintended consequences. Think of the system2 function in R: it’s a powerful tool, but it’s also a potential security risk.
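That system2 concern is concrete. A code string looks like data, but eval(parse(text = ...)) treats it as a program; the harmless echo here stands in for something nastier:

```r
# A "query" string that actually shells out -- eval() runs it happily
code <- "system2('echo', 'your saddle bags', stdout = TRUE)"
eval(parse(text = code))
#> [1] "your saddle bags"
```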

There’s also the matter of requesting too large a dataset. In my workplace the sales data comprises hundreds of millions of records, so select * or other long-running queries are a real problem that must be avoided.
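One cheap mitigation for the size problem, sketched here as a hypothetical helper (preview_query and the 100-row default are my own choices, not part of ellmer): push a row cap into the SQL itself. head() on a lazy tbl() translates to LIMIT in the generated query, so a stray select * never leaves the database in full.

```r
# Hypothetical helper: cap any lazy dbplyr query before execution.
# head(n_max) becomes LIMIT n_max in the rendered SQL, so at most
# n_max rows ever cross the wire.
preview_query <- function(query, n_max = 100) {
    query |>
        head(n_max) |>
        dplyr::collect()
}
```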

Setting Up the Demo

To make this a lightweight demo, I’m going to generate an SQLite dummy database.

library(dplyr)
library(dbplyr)
library(DBI)
library(ellmer)
library(charlatan) # ch_name() generates fake names

set.seed(123)
customers <- data.frame(int_id = 1:15) |>
    mutate(
        name = ch_name(15),
        group_code = sample(c("Red", "Green", "Blue"), 15, TRUE)
    )
head(customers)
  int_id                    name group_code
1      1     Matias Barrows-Jast      Green
2      2         Kayson Franecki        Red
3      3       Garrick Langworth        Red
4      4         Cordell Dickens       Blue
5      5          Destiney Dicki        Red
6      6 Mrs. Freddie Pouros DDS      Green
purchases = lapply(1:100, \(x) {
    customer_id = sample(1:15, 1)
    customer = customers[customer_id, ]

    # round() wraps the whole if/else chain so every group's
    # price is rounded, not just the last branch
    price = round(
        if (customer$group_code == 'Green') {
            rnorm(1, 50, 10)
        } else if (customer$group_code == 'Red') {
            rnorm(1, 90, 10)
        } else {
            rnorm(1, 92, 10)
        },
        2
    )

    data.frame(customer_id = customer_id, price = price)
}) |>
    bind_rows()

head(purchases)
  customer_id  price
1           8  79.35
2           7 102.43
3           1  38.77
4           1  54.04
5          13  57.80
6          15  92.29
conn = dbConnect(
    RSQLite::SQLite(),
    dbname = ":memory:"
)

dbWriteTable(conn, 'customers', customers)
dbWriteTable(conn, 'purchases', purchases)

Creating our Ellmer Tooling

Ultimately we’re going to ask our LLM to generate a string that represents dplyr code. Here’s a basic version of that for our example:

evaluate_dbplyr <- function(code) {
    customers = tbl(conn, 'customers')
    purchases = tbl(conn, 'purchases')

    # Parse and evaluate the code string to create a dbplyr query object
    query_obj <- eval(parse(text = code))

    # Convert the dbplyr query to SQL and execute
    sql <- dbplyr::sql_render(query_obj, conn) |>
        suppressWarnings()
    result <- DBI::dbGetQuery(conn, sql)

    capture.output(head(result))
}

evaluate_dbplyr <- tool(
    evaluate_dbplyr,
    name = 'evaluate_dbplyr',
    description = 'Converts the provided dplyr code string into SQL and runs the query. The head of the result set is returned.',
    arguments = list(
        code = type_string("The dplyr code string to be evaluated.")
    )
)
Warning

Any time you’re running eval(parse(text = code)) with code that you did not write, you’re opening up the door to arbitrary code execution. We’ll talk about mitigation strategies in the next section, but for now let’s get the basic functionality working.

And now we’re going to develop a prompt for the LLM to give it some basic information about our data:

prompt = "You are working with a database that contains two tables: customers and purchases. The customers table has columns int_id, name, and group_code. The purchases table has columns customer_id and price. Your task is to write dplyr code and execute it using the available tool to meet user requests. The tables are represented as tbl() objects, so you need only pass them as arguments to the dplyr pipe. For example, to get the average price of purchases by group code, you would use the following code:
    purchases |>
    left_join(customers, by = c(customer_id = 'int_id')) |>
    group_by(group_code) |>
    summarise(avg_price = mean(price))

You must use the provided tool to execute the code and return the results to the user.
"

chat = chat_openai(prompt, model='gpt-4o')
chat$register_tool(evaluate_dbplyr)

chat$chat('What is the total number of purchases by group code?')
◯ [tool call] evaluate_dbplyr(code = "purchases |> left_join(customers, by =
c(customer_id = 'int_id')) |> group_by(group_code) |> summarise(total_purchases
= n())")
● #> group_code total_purchases
  #> 1 Blue 28
  #> 2 Green 36
  #> 3 Red 36
The total number of purchases by group code is as follows:

- Blue: 28 purchases
- Green: 36 purchases
- Red: 36 purchases

Securing the Environment

The warning above about arbitrary code execution is real. When we use eval(parse(text = code)), we’re essentially giving the LLM the ability to run any R code it wants. This includes potentially malicious operations like system(), file.remove(), or accessing sensitive data.

While we can’t make this completely bulletproof, we can significantly increase the difficulty for malicious code by creating a restricted environment that only exposes the functions we actually need.

# Helper function to create a restricted environment for secure evaluation
create_restricted_env <- function(connection) {
    restricted_env <- new.env(parent = emptyenv())

    # Add table connections
    restricted_env$tbl <- tbl
    restricted_env$customers <- tbl(connection, 'customers')
    restricted_env$purchases <- tbl(connection, 'purchases')

    # dplyr functions
    restricted_env$select <- dplyr::select
    restricted_env$filter <- dplyr::filter
    restricted_env$mutate <- dplyr::mutate
    restricted_env$group_by <- dplyr::group_by
    restricted_env$summarise <- dplyr::summarise
    restricted_env$summarize <- dplyr::summarize
    restricted_env$arrange <- dplyr::arrange
    restricted_env$left_join <- dplyr::left_join
    restricted_env$inner_join <- dplyr::inner_join
    restricted_env$right_join <- dplyr::right_join
    restricted_env$full_join <- dplyr::full_join
    restricted_env$count <- dplyr::count
    restricted_env$distinct <- dplyr::distinct

    # Statistical functions
    restricted_env$mean <- base::mean
    restricted_env$sum <- base::sum
    restricted_env$min <- base::min
    restricted_env$max <- base::max
    restricted_env$n <- dplyr::n
    restricted_env$sd <- stats::sd
    restricted_env$head <- utils::head
    restricted_env$desc <- dplyr::desc

    # Pipe operator
    restricted_env$`%>%` <- magrittr::`%>%`

    # Essential operators
    restricted_env$`==` <- base::`==`
    restricted_env$`!=` <- base::`!=`
    restricted_env$`>` <- base::`>`
    restricted_env$`<` <- base::`<`
    restricted_env$`>=` <- base::`>=`
    restricted_env$`<=` <- base::`<=`
    restricted_env$`&` <- base::`&`
    restricted_env$`|` <- base::`|`
    restricted_env$`+` <- base::`+`
    restricted_env$`-` <- base::`-`
    restricted_env$`*` <- base::`*`
    restricted_env$`/` <- base::`/`
    restricted_env$c <- base::c

    restricted_env
}

evaluate_dbplyr_secure <- function(code) {
    # Create restricted environment using helper
    restricted_env <- create_restricted_env(conn)

    # Parse and evaluate in the restricted environment
    query_obj <- eval(parse(text = code), envir = restricted_env)

    # Convert to SQL and execute
    sql <- dbplyr::sql_render(query_obj, conn) |> suppressWarnings()
    result <- DBI::dbGetQuery(conn, sql)

    capture.output(head(result))
}

evaluate_dbplyr_secure <- tool(
    evaluate_dbplyr_secure,
    name = 'evaluate_dbplyr_secure',
    description = 'converts the provided dplyr code string into sql and runs the query in a restricted environment. The head of the result set is returned.',
    arguments = list(
        code = type_string("The dplyr code string to be evaluated.")
    )
)
Note: Security Limitations

This explicit whitelist approach significantly reduces the attack surface, but it’s not perfect:

  1. Operators can be powerful: Even basic operators like + or / can potentially be used creatively for unintended purposes
  2. Memory and CPU: We don’t limit computational resources, so infinite loops or memory exhaustion are still possible
  3. SQL injection: While we’re using dbplyr which should parameterize queries, complex expressions might still create vulnerabilities
  4. Function composition: Combinations of allowed functions might create unexpected behaviors

The advantage of this approach is that we have complete control over what’s available: no dangerous functions slip through, because we explicitly add every single function that’s allowed.
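We can sanity-check this directly. Because the environment’s parent is emptyenv(), anything outside the whitelist simply fails to resolve before it can run. Reusing the create_restricted_env() helper and conn from above:

```r
# system() was never whitelisted, so the lookup itself fails
result <- try(
    eval(parse(text = "system('ls')"),
         envir = create_restricted_env(conn)),
    silent = TRUE
)
inherits(result, "try-error")
#> [1] TRUE
```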

For production use, consider additional protections like:

  • Process sandboxing (running in a container or separate process)
  • Resource limits (timeout, memory caps)
  • Query result size limits
  • Logging and monitoring of all executed code
  • More sophisticated AST parsing to validate code before execution
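As a concrete example of one of those protections, base R’s setTimeLimit() gives a dependency-free wall-clock cap. This evaluate_with_timeout wrapper and its 10-second default are just a sketch; it won’t stop memory exhaustion, for which you’d want a separate process (e.g. via the callr package).

```r
# Sketch: abort evaluation that exceeds a wall-clock budget.
# setTimeLimit() raises an error once the elapsed limit is hit.
evaluate_with_timeout <- function(code, envir, seconds = 10) {
    setTimeLimit(elapsed = seconds, transient = TRUE)
    # reset the limit when we leave, even on error
    on.exit(setTimeLimit(elapsed = Inf, transient = TRUE), add = TRUE)
    eval(parse(text = code), envir = envir)
}
```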

Let’s test our secure version:

chat_secure = chat_openai(prompt, model='gpt-4o')
chat_secure$register_tool(evaluate_dbplyr_secure)

chat_secure$chat('Show me the top 3 customers by total purchase amount')
◯ [tool call] evaluate_dbplyr_secure(code = "purchases |>\n
group_by(customer_id) |>\n summarise(total_purchase = sum(price, na.rm = TRUE))
|>\n arrange(desc(total_purchase)) |>\n head(3) |>\n left_join(customers, by =
c(customer_id = 'int_id')) |>\n select(name, total_purchase)")
● #> name total_purchase
  #> 1 Keyshawn Schaefer 1011.2604
  #> 2 Destiney Dicki 870.9362
  #> 3 Ferdinand Bergstrom 678.7800
Here are the top 3 customers by total purchase amount:

1. **Keyshawn Schaefer** - $1011.26
2. **Destiney Dicki** - $870.94
3. **Ferdinand Bergstrom** - $678.78

Creating a Database Chat Class

Let’s extend ellmer’s Chat class and reuse our secure evaluation function:

# Create a new R6 class that inherits from Chat
DatabaseChat <- R6::R6Class(
    "DatabaseChat",
    inherit = ellmer:::Chat,
    public = list(
        connection = NULL,
        last_query = NULL,
        last_result = NULL,

        initialize = function(connection, prompt, model = 'gpt-4o', ...) {
            # Create an OpenAI provider for the Chat class
            provider <- ellmer:::ProviderOpenAI(
                name = "OpenAI",
                base_url = "https://api.openai.com/v1",
                model = model,
                params = ellmer::params(),
                extra_args = list(),
                extra_headers = character(),
                credentials = ellmer:::as_credentials("chat_openai", ellmer:::openai_key),
                service_tier = "auto"
            )
            super$initialize(provider = provider, system_prompt = prompt)
            self$connection <- connection

            # Wrap the secure function to store results
            evaluate_db_with_storage <- function(code) {
                # Reuse the helper function to create restricted environment
                restricted_env <- create_restricted_env(self$connection)

                # Parse and evaluate in restricted environment
                query_obj <- eval(parse(text = code), envir = restricted_env)

                # Convert to SQL and execute
                sql <- dbplyr::sql_render(query_obj, self$connection) |> suppressWarnings()
                self$last_result <- DBI::dbGetQuery(self$connection, sql)
                self$last_query <- query_obj

                capture.output(head(self$last_result))
            }

            # Register the tool
            self$register_tool(tool(
                evaluate_db_with_storage,
                name = 'evaluate_database_query',
                description = 'Converts dplyr code to SQL and executes it.',
                arguments = list(code = type_string("The dplyr code string."))
            ))
        },

        get_full_results = function() self$last_result,
        get_query = function() self$last_query,
        collect_results = function() collect(self$last_query)
    )
)

Now we can create instances of our DatabaseChat class:

# Create a new database chat instance
db_chat <- DatabaseChat$new(
    connection = conn,
    prompt = prompt
)

db_chat$chat('What is the average purchase amount by group code?')
The average purchase amount by group code is as follows:

- Blue: 92.58
- Green: 49.29
- Red: 90.59

And access the full results using our new methods:

# Get the full result set
full_results <- db_chat$get_full_results()
print(full_results)
  group_code avg_price
1       Blue  92.57857
2      Green  49.28874
3        Red  90.59341
# Get the dbplyr query object
query_obj <- db_chat$get_query()
query_obj |>
    mutate(half_price = avg_price/2) |>
    print()
# Source:   SQL [?? x 3]
# Database: sqlite 3.51.1 [:memory:]
  group_code avg_price half_price
  <chr>          <dbl>      <dbl>
1 Blue            92.6       46.3
2 Green           49.3       24.6
3 Red             90.6       45.3
# Use collect() to execute the lazy query and pull the results
collected_results <- db_chat$collect_results()
print(collected_results)
# A tibble: 3 × 2
  group_code avg_price
  <chr>          <dbl>
1 Blue            92.6
2 Green           49.3
3 Red             90.6