Use Custom Charts in CockroachDB to Troubleshoot Issues Faster

I was working on a follow-up to load balancer comparison post.  This required a series of tests and examining numerous time-series graphs.  Initially, I spent more than the expected amount of time correlating the disparate graphs manually.  This post will describe CockroachDB Admin UI Custom Charts I ended up using. The steps should take just a couple of minutes to follow and should yield tremendous productivity.

As a background, CockroachDB Admin UI has prearranged grouping of times series graphs.  These dashboards are on separate pages and as a result, the graphs I needed were spread across many different pages. This caused some inefficiencies.  Flipping back and forth the dashboards and choosing the right criteria from pulldown menus each time was time-consuming. Remembering the info from different dashboards became difficult.   In the end, it was hard to draw conclusions from the various tests.

What would be better is to have the relevant graphs in one place correlated with each other. This way, it is easy to correlate graphs.  Easy to compare expected vs actual results. Easy to describe and validate the results.  Luckily, CockroachDB Admin UI provides functionality to customize graphs. This feature is called Customized Charts. I will describe simple steps to:

  • Find the available times series data
  • Create customized charts
  • Use the customized charts
  • Share and reuse the customized charts

Step 1: Find the available times series metrics

CockroachDB Admin UI Overview pages list over 200 time-series metrics that can be used in a Customized Chart.  For for the load balancer follow up post, I need to examine:

  • CPUs utilization is in the normal range to ensure correct results
  • Total number of connections match the test scenario
  • Connections are spread evenly among the CockroachDB nodes
  • TPS increases with the increased concurrency

The following are the Metrics Names of interest for this post.  These metrics can also be viewed on the standard CockroachDB Admin UI dashboards:  Overview Dashboard, Hardware Dashboard, and SQL Dashboard as a reference.

  • CPU information will be provided by sys.cpu.combined.percent-normalized
  • Connection information will be provided by sql.conns
  • TPS information will be provided by sql.select.count and sql.update.count

Step 2: Create Customized Charts

Follow the documentation to land at Custom Chart debug page in the Admin UI and follow the steps below.  The number below (1 thru 4) represents the area on the graphs.

  1. Click Add Chart four times, for each custom graph.  
  2. Pick the METRIC NAME from the pulldown menu for the first graph
  3. Pick the METRIC NAME  from the pulldown menu for the second graph.  Note how the same metrics can be used multiple times
  4. Choose PER NODE

custome chart.png

Step 3: Use Customized Charts

As the metrics are added, the following charts will be displayed in real-time.  The below example chart shows a 10-minute sample. The graphs are manually annotated for illustration purposes showing 4 test scenarios (app, JDBC, HAProxy, app w/ 1 thread) and the four criteria defined from Step 1: Find the available times series metrics.  With this customized chart, the 16 different outputs can be compared easily.

custom chart.png
Step 4: Share and Reuse the Customized Charts

To share and reuse the dashboard, save the URL of the customized chart.  The definition of the chart is encoded in the URL. Below is the encoded URL of the above chart.  Simply change the address highlighted in the Admin UI URL. (ie: change localhost:26258 to the target IP:Port)

http://localhost:26258/#/debug/chart?charts=…

Summary

There we have it. We showed it was easy to create a customized chart. We consolidated 3 dashboards down to a single chart. Used the chart to quickly validate a test. We saved the customized chart definition for share and reuse.

Posted in Uncategorized | Leave a comment

Getting started with CockroachDB in R

CockroachDB is known for handling terabytes of mission-critical transactions where the data and users are globally distributed.  R is known for “statistical computing and graphics” where the data is in the RAM of a single system used by a single user.  With CockroachDB’s recent vectorized SQL engine, R users can take advantage of CockroachDB’s globally distributed data.  

Screen Shot 2020-01-02 at 9.06.36 PM

CockroachDB is built to be largely compatible with PostgreSQL, meaning that software written to use PostgreSQL can sometimes (often!) be used with CockroachDB without changes.  This means R’s existing Postgres DBI interface should work with CockroachDB.

The original inspiration was taken from Getting started with PostgresSQL in R from R-bloggers.  Changes were made to reflect the use of RPostgres, CockroachDB MovR sample application, and CockroachDB.

Start CockroachDB and MovR

Open a terminal and start the following.  We will start a simple single-node CockroachDB.  Then start the MovR application running in the background while we work on R.  

cockroach start --insecure --port=26257 --http-port=26258 --store=cockroach-data/1 --cache=256MiB --background
cockroach workload init movr 'postgresql://root@localhost:26257?sslmode=disable'
cockroach workload run movr --concurrency 1

Start R with CockroachDB and ggplot2

Open a second terminal and start the following.  R DBI interface and ggplot2 setup are the same. Note the default port number on CockroachDB is 26257 (highlighted below).

# Install the latest RPostgres release from CRAN:
install.packages("RPostgres")
# Install ggplot and related to graph MovR activities
install.packages("dplyr")
install.packages("ggplot2")
# Load the libraries
require("ggplot2")
library(DBI)

# Connect to the default CockroachDB instance and MovR database
con <- dbConnect(RPostgres::Postgres(),dbname = "movr",
                 host = "localhost", port = 26257,
                 user = "root")

Differences between CockroachDB and Postgres

CockroachDB uses 64-bit integer whereas Postgres uses 32-bit integer.  ggplot2 will complain regarding the integer64.  The fix on ggplot2 can be tracked via #2377.  A workaround, for now, is to cast 64-bit integer to 32-bit integer using ::int4 in CockroachDB or as.integer in R.  In the R CLI, the following will demonstrate both workarounds.

# below will return CockroachDB default 64 bit integer
> df_users_64 <- dbGetQuery(con, "select city, count(*) from users group by city;")
> str(df_users_64)
'data.frame': 9 obs. of  2 variables:
 $ city : chr  "amsterdam" "boston" "los angeles" "new york" ...
 $ count:integer64 1897 1924 1938 1882 1938 1902 1929 1893 ...

# below will return count with Postgres default 32 bit integer
> df_users <- dbGetQuery(con, "select city, count(*)::int4 from users group by city;")
> str(df_users)
'data.frame': 9 obs. of  2 variables:
 $ city : chr  "amsterdam" "boston" "los angeles" "new york" ...
 $ count: int  1986 2009 2029 1963 2021 1989 2019 1960 2023

# df_users has count as int4 which maps to R integer
> ggplot(df_users, aes(city,count,fill=city)) + geom_col()

# df_users_64 has count as integer64.  Use as.integer to convert 64 bit integer
> ggplot(df_users_64, aes(city,as.integer(count),fill=city)) + geom_col()
Posted in cockroachdb, R | Tagged , , , , , , | Leave a comment

Compare application, JDBC, and HAProxy load balancers on CockroachDB

With CockroachDB, all nodes are equal.  Each CockroachDB node can run any SQL (DCL, DDL, DML, and TCL).  This greatly simplifies the load balancer configuration used when connecting to CockroachDB.  The simplicity opens up a plethora of load balancer options. 

Three classes of load balancer options are application-based, driver, and external load balancers.  As a general rule, an application-based load balancer could be a fit for specific situations. The driver load balancer is simple to use and quick to get started.  External load balancers take a bit more setup but will provide greater flexibility.  

The choice of infrastructure impacts performance greatly.  The choice of load balancer impacts performance greatly as well.  We will save a performance comparison for another post.

For a simple demonstration, we will use open-source YCSB to show how the three classes of load balancer can be used with CockroachDB.   YCSB has a built-in load balancer feature. JDBC driver used by YCSB JDBC DB binding has a load balancer.  External load balancer, HAProxy, configuration is supported by CockroachDB.   

Load Balancer Topology

The below topology diagrams show two basic differences in the load balancer topology.  The diagram on the left shows the application and JDBC load balancer. The diagram on the right shows an external HAProxy load balancer.  Other documented external load balancers can substitute the HAProxy in the right diagram.    

Application and JDBC Load Balancer External HAProxy Load Balancer
Screen Shot 2020-01-02 at 3.36.07 PM Screen Shot 2020-01-02 at 3.36.07 PM

Load Balancer Configuration

Application-based load balancing on YCSB 0.17.0 via PR #1233 will be used.  The below syntax shows the YCSB based load balancer used via the YCSB parameter -p db.url 

jdbc:postgresql://host1:port1/database;jdbc:postgresql://host2:port2/database

Postgres JDBC driver load balancing feature is called Connection Fail-over.  CockroachDB is a Postgres wire protocol compatible database and as such, this feature can be used.  The below syntax shows the JDBC driver based load balancer used via the YCSB parameter -p db.url

jdbc:postgresql://host1:port1,host2:port2/database

An external load balancer such as HAProxy uses a single virtual IP (VIP) that is used by the load balancer to target multiple nodes.  CockroachDB has a command to generate the required HAProxy configuration file. The below syntax shows the single VIP used via the YCSB parameter -p db.url 

jdbc:postgresql://host1:port1/database

Load Balancer Usage Comparison

Each CockroachDB node can run any SQL (DCL, DDL, DML, and TCL).  YCSB is used to demonstrate how any CockroachDB node can accept DDL (create) and DML (insert, update, select).   To prove this:

  • YCSB site.ycsb.db.JdbcDBCreateTable will be used to create the table (DDL) load balanced
  • YCSB load will be used to insert the data (DML) load balanced
  • YCSB run workload A will be used to update and select (DML) load balanced

Component Install and Setup

Below is an instruction that can be used to demonstrate the techniques on a laptop (tested on OSX).  The general steps are to setup:

  1. CockroachDB 
  2. YCSB
  3. JDBC Driver
  4. HAProxy 

CockroachDB

Below is a copy and paste of Start a Local Cluster (Insecure) for convenience.

cockroach start \
--insecure \
--store=node1 \
--listen-addr=localhost:26257 \
--http-addr=localhost:8080 \
--join=localhost:26257,localhost:26258,localhost:26259 \
--background

cockroach start \
--insecure \
--store=node2 \
--listen-addr=localhost:26258 \
--http-addr=localhost:8081 \
--join=localhost:26257,localhost:26258,localhost:26259 \
--background

cockroach start \
--insecure \
--store=node3 \
--listen-addr=localhost:26259 \
--http-addr=localhost:8082 \
--join=localhost:26257,localhost:26258,localhost:26259 \
--background

YCSB

Copy and paste of YCSB Install with modifications to install only JDBC DB Binding. 

curl --location https://github.com/brianfrankcooper/YCSB/releases/download/0.17.0/ycsb-jdbc-binding-0.17.0.tar.gz | gzip -dc - | tar -xvf -
export YCSB=~/ycsb-jdbc-binding-0.17.0

Postgres JDBC

Place the Postgres JDBC driver in the YCSB lib directory.  

cd $YCSB/lib; curl -O --location https://jdbc.postgresql.org/download/postgresql-42.2.4.jar

HAProxy

The original Cockroach Labs instruction with added steps to change the VIP port number to 26256 for the purpose of the demonstration.

cockroach gen haproxy --insecure
sed -i.bak -e 's/bind :26257/bind :26256/g' haproxy.cfg
haproxy -f haproxy.cfg &

Running Tests

Below are steps for the application, JDBC and HAproxy load balancers.  Please note the following:

  • db.url is highlighted to show how each balancer is used
  • threadcount=3 to match the number of CockroachDB nodes.  This will ensure there is one connection per CockroachDB used by a round-robin connection.  The application load balancer will automatically create one thread per node.
  • recordcount=1000000 will create 1 million rows
  • requestdistribution=uniform will ensure all of the nodes get used equally
  • operationcount=300000 will run YCSB workload A this many times

Application Load Balancer

export YCSB=~/ycsb-jdbc-binding-0.17.0

cd $YCSB

cockroach sql --insecure -e "drop table if exists usertable"

java -cp lib/jdbc-binding-0.17.0.jar:lib/postgresql-42.2.4.jar site.ycsb.db.JdbcDBCreateTable -p db.driver=org.postgresql.Driver -p db.url="jdbc:postgresql://127.0.0.1:26257/defaultdb?autoReconnect=true&sslmode=disable&ssl=false&reWriteBatchedInserts=true;jdbc:postgresql://127.0.0.1:26259/defaultdb?autoReconnect=true&sslmode=disable&ssl=false&reWriteBatchedInserts=true;jdbc:postgresql://127.0.0.1:26261/defaultdb?autoReconnect=true&sslmode=disable&ssl=false&reWriteBatchedInserts=true" -p db.user=root -p db.passwd="" -n usertable 

bin/ycsb load jdbc -s -P workloads/workloada -p db.driver=org.postgresql.Driver -p db.url="jdbc:postgresql://127.0.0.1:26257/defaultdb?autoReconnect=true&sslmode=disable&ssl=false&reWriteBatchedInserts=true;jdbc:postgresql://127.0.0.1:26259/defaultdb?autoReconnect=true&sslmode=disable&ssl=false&reWriteBatchedInserts=true;jdbc:postgresql://127.0.0.1:26261/defaultdb?autoReconnect=true&sslmode=disable&ssl=false&reWriteBatchedInserts=true" -p db.user=root -p db.passwd="" -p db.batchsize=128  -p jdbc.fetchsize=10 -p jdbc.autocommit=true -p jdbc.batchupdateapi=true -p db.batchsize=128 -p recordcount=1000000  

bin/ycsb run jdbc -s -P workloads/workloada -p db.driver=org.postgresql.Driver -p db.url="jdbc:postgresql://127.0.0.1:26257/defaultdb?autoReconnect=true&sslmode=disable&ssl=false&reWriteBatchedInserts=true;jdbc:postgresql://127.0.0.1:26259/defaultdb?autoReconnect=true&sslmode=disable&ssl=false&reWriteBatchedInserts=true;jdbc:postgresql://127.0.0.1:26261/defaultdb?autoReconnect=true&sslmode=disable&ssl=false&reWriteBatchedInserts=true" -p db.user=root -p db.passwd="" -p db.batchsize=128  -p jdbc.fetchsize=10 -p jdbc.autocommit=true -p jdbc.batchupdateapi=true -p db.batchsize=128 -p recordcount=1000000 -p requestdistribution=uniform -p operationcount=300000 

JDBC Load Balancer

export YCSB=~/ycsb-jdbc-binding-0.17.0

cd $YCSB

cockroach sql --insecure -e "drop table if exists usertable"

java -cp lib/jdbc-binding-0.17.0.jar:lib/postgresql-42.2.4.jar site.ycsb.db.JdbcDBCreateTable -p db.driver=org.postgresql.Driver -p db.url="jdbc:postgresql://127.0.0.1:26261,127.0.0.1:26259,127.0.0.1:26257/defaultdb?autoReconnect=true&sslmode=disable&ssl=false&reWriteBatchedInserts=true&loadBalanceHosts=true" -p db.user=root -p db.passwd="" -n usertable

bin/ycsb load jdbc -s -P workloads/workloada -p db.driver=org.postgresql.Driver -p db.url="jdbc:postgresql://127.0.0.1:26261,127.0.0.1:26259,127.0.0.1:26257/defaultdb?autoReconnect=true&sslmode=disable&ssl=false&reWriteBatchedInserts=true&loadBalanceHosts=true" -p db.user=root -p db.passwd="" -p db.batchsize=128  -p jdbc.fetchsize=10 -p jdbc.autocommit=true -p jdbc.batchupdateapi=true -p db.batchsize=128 -p recordcount=1000000 -p threadcount=3

bin/ycsb run jdbc -s -P workloads/workloada -p db.driver=org.postgresql.Driver -p db.url="jdbc:postgresql://127.0.0.1:26261,127.0.0.1:26259,127.0.0.1:26257/defaultdb?autoReconnect=true&sslmode=disable&ssl=false&reWriteBatchedInserts=true&loadBalanceHosts=true" -p db.user=root -p db.passwd="" -p db.batchsize=128  -p jdbc.fetchsize=10 -p jdbc.autocommit=true -p jdbc.batchupdateapi=true -p db.batchsize=128 -p recordcount=1000000 -p requestdistribution=uniform -p operationcount=300000 -p threadcount=3

HA Proxy Load Balancer

export YCSB=~/ycsb-jdbc-binding-0.17.0

cd $YCSB

cockroach sql --insecure -e "drop table if exists usertable"

java -cp lib/jdbc-binding-0.17.0.jar:lib/postgresql-42.2.4.jar site.ycsb.db.JdbcDBCreateTable -p db.driver=org.postgresql.Driver -p db.url="jdbc:postgresql://127.0.0.1:26256/defaultdb?autoReconnect=true&sslmode=disable&ssl=false&reWriteBatchedInserts=true&loadBalanceHosts=true" -p db.user=root -p db.passwd="" -n usertable

bin/ycsb load jdbc -s -P workloads/workloada -p db.driver=org.postgresql.Driver -p db.url="jdbc:postgresql://127.0.0.1:26256/defaultdb?autoReconnect=true&sslmode=disable&ssl=false&reWriteBatchedInserts=true&loadBalanceHosts=true" -p db.user=root -p db.passwd="" -p db.batchsize=128  -p jdbc.fetchsize=10 -p jdbc.autocommit=true -p jdbc.batchupdateapi=true -p db.batchsize=128 -p recordcount=1000000 -p threadcount=3

bin/ycsb run jdbc -s -P workloads/workloada -p db.driver=org.postgresql.Driver -p db.url="jdbc:postgresql://127.0.0.1:26256/defaultdb?autoReconnect=true&sslmode=disable&ssl=false&reWriteBatchedInserts=true&loadBalanceHosts=true" -p db.user=root -p db.passwd="" -p db.batchsize=128  -p jdbc.fetchsize=10 -p jdbc.autocommit=true -p jdbc.batchupdateapi=true -p db.batchsize=128 -p recordcount=1000000 -p requestdistribution=uniform -p operationcount=300000 -p threadcount=3
Posted in cockroachdb | Tagged , , , , , , , , , | Leave a comment

Hello world!

Welcome to WordPress.com. This is your first post. Edit or delete it and start blogging!

Posted in Uncategorized | 1 Comment