brew install libiodbc
Recently we switched our data warehouse at work to Snowflake and have been pretty happy with the speed boost it provides. As an aside, we’ve been less happy with the insistence of ALL CAPS for column names. Anyway, I had already built an internal R package for our Postgres setup, so I ventured to make it work with Snowflake. In short, it took some doing and I wanted to share what I found worked best in case you find yourself in a similar situation.
The Rub
Snowflake does offer a tutorial on connecting R (including dplyr!) to their service, but it depends on rJava and RJODBC, which was overhead I wanted to avoid. Luckily, Snowflake also offers straight ODBC drivers too!
The main challenge with using these drivers has been choosing between the two software packages that are available to maintain ODBC drivers: unixodbc and iODBC. Snowflake suggests using iODBC, but may support unixODBC while R Studio suggests unixODBC, by may support iODBC. After trying out both, I have found that the iODBC route is less painful. While I was able make everything work with unixODBC, it involved editing config files by hand. Which may be fine for you and I, but I have less technical folks at my company to worry about.
Happily, the iODBC route works pretty smoothly, so that is what I will outline here. Note that it apparently causes trouble to have both installed, so you might want to remove unixODBC first if you already have it installed.
For reference, if entering odbcinst -j
into your console works then you have unixODBC installed. If iodbctest
works then you have iODBC installed.
iODBC
Most of these steps can be done by visiting the relevant websites, downloading the package installers, double clicking, etc., but we’ll outline how to do it from the command line here.
First, we need to install iODBC
Next, we download and install the latest odbc drivers from Snowflake. Note you may need to change the version number of the drivers to match what is in latest/
.
# Download file to current working directory
curl -O https://sfc-repo.snowflakecomputing.com/odbc/mac64/latest/snowflake_odbc_mac-2.20.3.dmg
# Run .pkg
hdiutil attach snowflake_odbc_mac-2.20.3.dmg
sudo installer -package /Volumes/snowflakeODBC/snowflakeODBC.pkg -target /
hdiutil detach /Volumes/snowflakeODBC/
Data Source Names
With the drivers installed, we can now store your Snowflake credentials as a Data Source Name (DSN) to make it easy to connect from R in your code.
First, we tell iODBC where to look for your DSNs. Entering the following in your console adds a line to the bottom of your .zshrc
that sets a persistent environment variable. If you are using bash, replace .zshrc
with .bashrc
.
echo 'export ODBCINI=$HOME/.odbc.ini' >> .zshrc
#echo 'export ODBCINI=$HOME/.odbc.ini' >> .bashrc
You will need to restart your terminal for this to take effect.
Create or edit your DSN files.
There should now be a .odbc.ini
and .odbcinst.ini
in your home directory. If not, create them!
Your .odbcinst.ini
should look similar to this:
[ODBC Drivers]
Snowflake = Installed
[Snowflake]
Driver = /opt/snowflake/snowflakeodbc/lib/universal/libSnowflake.dylib
Your .odbc.ini
should look like the following. Note that you can change the name in the []
(in the example it’s snowflake
) to whatever you want to use to call this connection in your code.
[ODBC Data Sources]
snowflake = Snowflake
[snowflake]
Driver = /opt/snowflake/snowflakeodbc/lib/universal/libSnowflake.dylib
Description = Typical user settings for Nick
UID = <your_username>
PWD = <your_password>
Server = <your_server>.snowflakecomputing.com
Role = <your_snowflake_role>
Database = <your_database>
Warehouse = <your_warehouse>
Schema = <your_default_schema>
Locale = en-US
Tracing = 0
odbc R Package
The last step is to install the odbc
R package from source. The install needs to be from source so that it can catch you are using iODBC and adjust.
install.packages("odbc", type = "source")
At this point you can restart your R session and run the following commands to make sure R sees your connection options.
::odbcListDrivers()
odbc::odbcListDataSources() odbc
Finally, we can connect and try out some queries.
<- DBI::dbConnect(odbc::odbc(), "snowflake") con
Future
In future posts I hope to share some custom dbplyr translations of lubridate / stringr functions I created to work with Snowflake’s functions. Until then, I hope this helps!