Presto is a fast distributed SQL query engine for big data. I wrote a more introductory, up-and-running post a while back.
Presto users frequently [1, 2, 3, 4] want the ability to log various details about queries and their execution. This is very useful for operationalizing Presto in any organization. Logging query details allows a team to understand how Presto is used, provide operational analytics, and identify performance bottlenecks. If you want to know how to achieve this, read on. You can also use this guide to learn how to implement any Presto plugin. All the code used in this post is available.
Event Listeners
One of the best things about Presto’s design is its clean abstractions. Event Listener is one such abstraction. Presto added Event Listener support some time back, similar to other engines. A Presto Event Listener allows you to write custom functions that listen to events happening inside the engine and react to them. Event listeners are invoked for the following events in the Presto query workflow:
- Query creation
- Query completion
- Split completion
A couple of caveats regarding Event Listeners in Presto:
- In a given Presto cluster, you can only register a single event listener plugin.
- Each Presto event listener is a Presto plugin, so it behaves like one in terms of how it is registered and so on.
So, to create a query logging Presto plugin, at a high level, we will:
- Implement the EventListener and EventListenerFactory interfaces from Presto.
- Package our classes and register the plugin so that Presto can find it.
- Deploy the plugin to Presto.
If these names don’t make sense right now, don’t worry. We will go through detailed step-by-step instructions below.
Implementation
We will use Maven for dependency management and packaging. Set it up and create an empty Maven project.
Add Presto as a dependency by adding the following to the project section of the pom.xml file.
    <dependencies>
        <dependency>
            <groupId>com.facebook.presto</groupId>
            <artifactId>presto-spi</artifactId>
            <version>0.172</version>
            <scope>provided</scope>
        </dependency>
    </dependencies>
We will also use slf4j for logging the query details. Note that we use the logger interface for logging the queries because it provides maximal flexibility in choosing where and how to store the logs. So let’s add that as a dependency as well, inside the same dependencies element.
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-log4j12</artifactId>
        <version>1.7.16</version>
    </dependency>
Time to write code. We will start by creating a QueryFileLoggerEventListener class that implements Presto’s EventListener interface.
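    import com.facebook.presto.spi.eventlistener.EventListener;
    import com.facebook.presto.spi.eventlistener.QueryCompletedEvent;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    import java.util.Map;

    public class QueryFileLoggerEventListener implements EventListener {
        static final Logger logger = LoggerFactory.getLogger(QueryFileLoggerEventListener.class);

        // The config map comes from the factory; this minimal listener does not use it.
        public QueryFileLoggerEventListener(Map<String, String> config) {}

        // Writes one log line per completed query, with fields separated by " : ".
        @Override
        public void queryCompleted(QueryCompletedEvent queryCompletedEvent) {
            logger.info(
                    queryCompletedEvent.getMetadata().getQueryId() + " : " +
                    queryCompletedEvent.getMetadata().getQueryState() + " : " +
                    queryCompletedEvent.getMetadata().getQuery() + " : " +
                    queryCompletedEvent.getStatistics().getTotalRows() + " : " +
                    queryCompletedEvent.getStatistics().getTotalBytes() + " : ");
        }
    }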
Here we are logging query details (QueryId, State, Query) and statistics (TotalRows, TotalBytes), separating them with " : " in a single log line. Note that, for space reasons, I am showing only a few query details; the event object contains a lot of other useful information. For example, you can use the State to determine whether the query failed or succeeded and log different details in each case. The code in the repo logs additional details.
Also, note that we have only implemented the queryCompleted() method from the EventListener interface. The interface also provides queryCreated() and splitCompleted() methods for query creation and split completion event notifications.
Now let’s create a QueryFileLoggerEventListenerFactory class that implements Presto’s EventListenerFactory interface.
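    import com.facebook.presto.spi.eventlistener.EventListener;
    import com.facebook.presto.spi.eventlistener.EventListenerFactory;

    import java.util.Map;

    public class QueryFileLoggerEventListenerFactory implements EventListenerFactory {
        // Must match the event-listener.name value used when configuring the listener.
        @Override
        public String getName() {
            return "event-logger";
        }

        // Receives the listener configuration as a map and builds the listener.
        @Override
        public EventListener create(Map<String, String> config) {
            return new QueryFileLoggerEventListener(config);
        }
    }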
Here we are creating a minimal implementation of the factory that simply constructs our listener. If you need to perform any additional initialization, you can add it here. Also, note that we are naming our logging listener event-logger, which we will use later when configuring it.
As noted previously, event listeners are registered as a plugin in Presto. So, let’s create QueryFileLoggerPlugin, which implements Presto’s Plugin interface.
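    import com.facebook.presto.spi.Plugin;
    import com.facebook.presto.spi.eventlistener.EventListenerFactory;
    import com.google.common.collect.ImmutableList;

    public class QueryFileLoggerPlugin implements Plugin {
        // Expose our factory so that Presto can create the listener.
        @Override
        public Iterable<EventListenerFactory> getEventListenerFactories() {
            EventListenerFactory listenerFactory = new QueryFileLoggerEventListenerFactory();
            return ImmutableList.of(listenerFactory);
        }
    }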
Here we are again simply registering our factory as part of the plugin.
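Coming back to the listener for a moment: as mentioned earlier, the State can be used to log different details for failed and successful queries. Here is a minimal sketch of that idea, written as a standalone helper so it can be tried without a Presto cluster. QueryLogLine, its format() method and the failure parameter are names made up for this illustration, and the sketch assumes that a failed query reports the state FAILED.

```java
import java.util.Optional;

/** Hypothetical helper: builds one log line from values read off a completed-query event. */
final class QueryLogLine {
    private QueryLogLine() {}

    /**
     * @param state   the query state string; assumed to be "FAILED" for failed queries
     * @param failure a failure message, if one is available (hypothetical parameter)
     */
    static String format(String queryId, String state, String query,
                         long totalRows, long totalBytes, Optional<String> failure) {
        StringBuilder line = new StringBuilder()
                .append(queryId).append(" : ")
                .append(state).append(" : ")
                .append(query);
        if ("FAILED".equals(state)) {
            // For failed queries, log the reason instead of the statistics.
            line.append(" : ").append(failure.orElse("unknown failure"));
        } else {
            // For all other states, log row and byte counts as in the listener above.
            line.append(" : ").append(totalRows).append(" : ").append(totalBytes);
        }
        return line.toString();
    }

    public static void main(String[] args) {
        System.out.println(format("q1", "FINISHED", "SELECT 1", 1, 9, Optional.empty()));
        System.out.println(format("q2", "FAILED", "SELECT x", 0, 0, Optional.of("column not found")));
    }
}
```

Inside queryCompleted(), you would call such a helper with the values read from the event’s metadata and statistics and pass the result to logger.info().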
Packaging
Now that we have all the code, let’s move on to packaging it.
Presto uses Service Provider Interfaces (SPI) for its extensions. SPI is widely used in the Java world. Presto uses SPI to load Connectors, Functions, Types and System Access Control. SPI implementations are discovered via metadata files. We will create the src/main/resources/META-INF/services/com.facebook.presto.spi.Plugin metadata file. The file should contain the fully qualified class name of our plugin, QueryFileLoggerPlugin.
We will also add a log4j.properties file that specifies where to write our query logs. You should adapt this to your environment.
Let’s compile and package our code.
mvn package
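For reference, the service file created above follows the format read by java.util.ServiceLoader: one fully qualified class name per line, with blank lines and anything after a # ignored. The sketch below is not Presto code; it only mimics those rules to show what a valid file looks like. The package com.example.presto is a placeholder, so use whatever package your plugin class actually lives in.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: extracts provider class names from service file content. */
final class ServiceFileFormat {
    private ServiceFileFormat() {}

    static List<String> providerNames(String fileContent) {
        List<String> names = new ArrayList<>();
        for (String line : fileContent.split("\\R")) {
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash); // drop the comment part
            }
            line = line.trim();                 // surrounding whitespace is ignored
            if (!line.isEmpty()) {
                names.add(line);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Placeholder package name; the class name comes from this post.
        String content = "# Presto plugin registration\n"
                + "com.example.presto.QueryFileLoggerPlugin\n";
        System.out.println(providerNames(content));
    }
}
```

If the plugin does not get picked up, a class name in this file that lacks its package prefix or does not match the class in the jar is a good first thing to check.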
Deployment
At this stage, we have our code ready to deploy to Presto.
First, we have to tell Presto to load our listener. We will create the event listener configuration file <path-to-presto>/etc/event-listener.properties. This configuration file should have at least the event-listener.name property, whose value should match the string returned by EventListenerFactory.getName(), in our case event-logger. The remaining properties will be passed as a map to EventListenerFactory.create(), which you can use to pass any additional information to your listener.
Copy our generated jar to the Presto plugins directory.
cp target/presto-event-logger*.jar <path-to-presto>/plugin/event-logger/
You should also copy slf4j-api-*.jar, slf4j-log4j12-*.jar, guava-*.jar, log4j-*.jar and any additional dependencies that you have to the event-logger folder <path-to-presto>/plugin/event-logger/.
We are all set. Start the Presto server.
<path-to-presto>/bin/launcher start
You should see the event listener registration in the Presto server logs. You should also see your query logs as queries are submitted to Presto.
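To make the configuration step concrete, here is a small sketch of how the contents of event-listener.properties split into the listener name and the map that reaches EventListenerFactory.create(), as described above. It is a standalone imitation, not Presto’s actual loading code, and the log-prefix property is a made-up example of an extra setting.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

/** Illustration only: separates the listener name from the remaining properties. */
final class EventListenerConfigSketch {
    static final String NAME_PROPERTY = "event-listener.name";

    private EventListenerConfigSketch() {}

    /** Parses properties-file text into a plain map. */
    static Map<String, String> parse(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> all = new HashMap<>();
        for (String key : props.stringPropertyNames()) {
            all.put(key, props.getProperty(key));
        }
        return all;
    }

    /** Returns a copy without the name property: the map a factory's create() would see. */
    static Map<String, String> withoutName(Map<String, String> all) {
        Map<String, String> rest = new HashMap<>(all);
        rest.remove(NAME_PROPERTY);
        return rest;
    }

    public static void main(String[] args) {
        String text = "event-listener.name=event-logger\n"
                + "log-prefix=presto-queries\n"; // hypothetical extra property
        Map<String, String> all = parse(text);
        System.out.println("listener name: " + all.get(NAME_PROPERTY));
        System.out.println("passed to create(): " + withoutName(all));
    }
}
```

Our QueryFileLoggerEventListener ignores the map it receives, but a constructor that reads a setting such as the hypothetical log-prefix from it would be a natural next step.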
That’s it. We saw how to write a query logger plugin for Presto. As noted above, the complete code is available. Give it a try!