Sunday, February 16, 2014

CSV logging in Log4J2

Full-featured CSV logging should satisfy following requirements:
  1. Dynamic CSV file header
    i.e. any CSV header change triggers automatic file rotation, and a new filename
  2. Usage of structured messages
    i.e. rather than a simple String object, message is presented as a Map<String, String>, where every key-value pair maps CSV column name and its value
  3. Dynamic mapping between message fields and the CSV header
    i.e. message carries superfluous information and it's up to the runtime configuration (.xml file) to select subset being recorded in the .log file
  4. Dynamic routing
    i.e. messages are routed dynamically to its destination file base on the value of one or more fields of the structured message
To be honest, there is still some way for Log4J2 to implement fully-featured CSV logging. As of the time of writing:
  • #2 is implemented in full
  • #3 and #4 are implemented partially 
  • #1 has been requested and documented in [2] and [3]
Feature absence, however, is not stopping Product Management from coming up with new requirements, and we will review what can be done today.

Log4J2 has two great features - EventLogger and StructuredDataMessage. Official documentation is available at [1], but in short - it gives us ability to:
  • Perform dynamic routing base on the message.type field
  • Perform logging in transactional context:
    i.e. we will accumulate messages during the transaction and flush them to the log file only if transaction is considered successful
First, our exemplary EventLogger transactional wrapper looks like:
import org.apache.logging.log4j.EventLogger;
import org.apache.logging.log4j.message.StructuredDataMessage;
import java.util.ArrayList;
import java.util.List;
/**
* @author Bohdan Mushkevych
* wraps Class holds List<StructuredDataMessage> and flushes it to the EventLogger
*/
public class EventLoggerTx {
public static final String MESSAGE_KEY = "m";
private List<StructuredDataMessage> eventPipe = new ArrayList<StructuredDataMessage>();
private boolean isTransactionOn = false;
public EventLoggerTx() {
}
/**
* Method creates an event and flushes it to the filesystem immediately
* @param body message body
* @param type message type and thus - name of the log file to record the event
*/
public void logEventDirectly(String body, String type) {
StructuredDataMessage msg = new StructuredDataMessage("", "", type);
msg.put(MESSAGE_KEY, body);
EventLogger.logEvent(msg);
}
/**
* Method creates an event and keeps it in the pipeline
* The pipeline will be flushed to the filesystem during CommitTransaction method
* @param body message body
* @param type name of the table, and thus - name of the log file to record the event
*/
public void logEvent(String body, String type) {
if (!isTransactionOn) {
logEventDirectly(body, type);
} else {
StructuredDataMessage msg = new StructuredDataMessage("", "", type);
msg.put(MESSAGE_KEY, body);
eventPipe.add(msg);
}
}
public void startTransaction() {
if (isTransactionOn) {
LogHandler.write("On StartTransaction: transaction was already ON.");
}
if (!eventPipe.isEmpty()) {
LogHandler.write("On StartTransaction: pipeline was not empty. Purging " + eventPipe.size() + " elements.");
eventPipe = new ArrayList<StructuredDataMessage>();
}
isTransactionOn = true;
}
public void commitTransaction() {
if (eventPipe.isEmpty()) {
LogHandler.write("On CommitTransaction: pipeline is empty. Nothing to flush.");
}
for (StructuredDataMessage msg: eventPipe) {
EventLogger.logEvent(msg);
}
isTransactionOn = false;
eventPipe = new ArrayList<StructuredDataMessage>();
}
public void rollbackTransaction() {
if (!isTransactionOn) {
LogHandler.write("On RollbackTransaction: transaction was already OFF.");
}
LogHandler.write("On RollbackTransaction: Purging " + eventPipe.size() + " elements.");
eventPipe = new ArrayList<StructuredDataMessage>();
isTransactionOn = false;
}
}
Secondly, the log4j2.xml file refers to message.type for routing (see the in-line comments for explanation):
<?xml version="1.0" encoding="UTF-8"?>
<!-- status The level of internal Log4j events that should be logged to the console. -->
<!-- name The name of the configuration. -->
<!-- packages A comma separated list of package names to search for plugins. -->
<Configuration status="info" name="RoutingExample" packages="">
<Properties>
<Property name="routing_filename">${log.path}/type-$${sd:type}.log</Property>
</Properties>
<Appenders>
<Console name="STDOUT">
<PatternLayout pattern="%d{yyyyMMddHH}{GMT+0} %m%n"/>
</Console>
<Routing name="Routing">
<!-- Here, $${sd:type} inspects message.type field of incoming message for later use in routing and RollingFile setup -->
<Routes pattern="$${sd:type}">
<Route>
<!-- ${log.path} is replaced by Maven during the build with absolute path to the folder containing log files -->
<RollingFile name="RollingFile-${sd:type}"
fileName="${routing_filename}"
filePattern="${log.path}/%d{yyyyMMdd}{GMT+0}/%d{yyyyMMddHH}{GMT+0}-${sd:type}-${hostName}.%i.log.gz">
<PatternLayout>
<!-- %K{m} stands for the message passed in StructuredDataMessage map with key "m" -->
<!-- %n stands for new line -->
<Pattern>%d{yyyyMMddHH}{GMT+0} %K{m}%n</Pattern>
</PatternLayout>
<Policies>
<TimeBasedTriggeringPolicy interval="1"/>
<SizeBasedTriggeringPolicy size="64 MB"/>
</Policies>
<DefaultRolloverStrategy max="999"/>
</RollingFile>
</Route>
</Routes>
</Routing>
</Appenders>
<Loggers>
<Logger name="EventLogger" level="info" additivity="false">
<AppenderRef ref="Routing"/>
</Logger>
<Root level="warn">
<AppenderRef ref="STDOUT"/>
</Root>
</Loggers>
</Configuration>
view raw log4j2.xml hosted with ❤ by GitHub
Finally, the workaround for CSV header is to store it in another file with the same message.type routing key, and later on use this separate .header file during CSV processing. For instance in example below message.type is the "tableName":

public class HeaderContext {
public static final Map<String, TableSchema> TABLE_CONTEXT = new HashMap<String, TableSchema>();
static {
// Initialize TABLE_CONTEXT with actual data
}
public static outputHeaders() {
for (String tableName: TABLE_CONTEXT.keySet()) {
PrintWriter writer = new PrintWriter(tableName + ".header", "UTF-8");
TableSchema schema = TABLE_CONTEXT.get(tableName);
writer.println(schema.getFieldNames());
writer.close();
}
}
}
Cheers!


[1] Event logging in Log4J2:
http://logging.apache.org/log4j/2.x/manual/eventlogging.html

[2] RollingFile Appender - callbacks when rolling
https://issues.apache.org/jira/browse/LOG4J2-491

[3] Allow Layout header and footer to be configured in files and to contain properties
https://issues.apache.org/jira/browse/LOG4J2-496