Dynamic tracing of D-Bus

It came to my mind one day while I was troubleshooting D-Bus related issues that it might be interesting to use  DTrace for the task. For the uninitiated, D-Bus is a message bus system used in many desktop environments such as GNOME and KDE for inter-process communication. What I really wanted is to trace incoming and outgoing messages. Ideally, the way to proceed would be to add static probes inside the D-Bus code. But I preferred not to modify the code and build scripts of D-Bus just for a simple experiment. As such, this blog entry will describe my quest at tracing and decoding D-Bus messages using the pid provider. For reference, I use version 1.4.1 for this entry, due to the inherent instability of internal interfaces probed using the pid provider, the scripts provided in this blog may or may not work with other versions.

Assuming one only wants statistics about when messages are sent or received is easy, it's only a matter of finding a function that is called once per message sent or received. Decoding the message is a completely different manner though. What I really need here is the message header, the data itself is of little interest at this level of tracing. I elected to start by trying to catch a message sent from a client application using libdbus. The full message writing tracing script can be seen in libdbus-write.d.

Tracing writes

Diving straight into the code, dbus-connection.c briefly explains the inner working of a D-Bus connection. In an extremely oversimplified manner, the process of sending a dbus message first involves building the message and then adding it to a queue with an optional timeout for reply. This queue is implemented internally using a linked list of messages to be sent. The same queue is used to manage reply timeouts I assume. Queuing a message is done by different flavors of the function dbus_connection_send() in dbus-connection.c. A main loop is then responsible of managing messages to be sent as well as checking timeouts, receiving replies etc. As part of the main loop message reads and writes are handled by the transport layer of D-Bus. One important function for us is the _dbus_transport_do_iteration() function implemented in dbus-transport.c. This function performs a single iteration of reading and writing data. From the function code we can see that a transport->vtable structure is used to call another do_iteration function. It is the transport abstraction layer of D-Bus. Since D-Bus supports more than one transport type function pointers are used to select between different implementations at run time. In our case only one transport is of interest and it is implemented in dbus-transport-socket.c.

Here we can find the socket_do_iteration() function, and in it a call to do_writing() implemented in the same file. We are getting close, however the argument of do_writing() containing the message is a DbusTransport struct which is rather complicated and contains many other nested structures including the DbusMessage struct which is the message proper. A bit later in the function _dbus_connection_get_message_to_send() is called which extracts the message from the DbusTransport structure. Now, we could choose to use that to trace the message, but there is an issue. The DbusMessage struct is complex and from looking at its definition in dbus-message-private.h it depends on how it was compiled. And really all that is needed is the header. Fortunately, just a few lines later is the call to the function that saves the day, _dbus_message_get_network_data() which extracts the network ready header and data. More importantly this data is in network format and as such its format is guaranteed by the D-Bus specification.

Without further waiting let's start writing a D scripts since this won't fit in a one-liner. First the begin probe to display a nice message that tracing has started :

#!/usr/sbin/dtrace -s

#pragma D option quiet

BEGIN
{
  printf( "Ready! Press CTRL-C to abort.\n" );
}

Then, what we want is the _dbus_message_get_network_data() function, however it is used somewhere else in the code. So to be certain that we are tracing the right function let's use a thread local variable that will form the basis of the predicate for the following tracing clauses :

pid$target:libdbus-1.so:do_writing:entry
{
  self->in_write = 1;
}

Next is the _dbus_message_get_network_data() function with the following implementation :
void

_dbus_message_get_network_data (DBusMessage          *message,
                                const DBusString    **header,
                                const DBusString    **body)
{
  _dbus_assert (message->locked);

  *header = &message->header.data;
  *body = &message->body;
}

What we need is the second argument after it was assigned, but we must save it at the function entry first as it won't be available in the return probe.

pid$target:libdbus-1.so:_dbus_message_get_network_data:entry
/ self->in_write == 1 /
{
  self->header_ref = arg1;
}

The data can then be read from the return probe. From the code, we see that the header argument is passed by reference. So the argument is really a pointer to a pointer to the DbusString struct containing the header. To dereference a user space pointer in DTrace one need to first copyin() the pointer value, then perform a second copyin() using the previously copied data like this:

pid$target:libdbus-1.so:_dbus_message_get_network_data:return
/ self->in_write == 1 /
{
  /* Copy the second function argument of type header** */
  this->header_ptr = *(uintptr_t*)copyin( self->header_ref, sizeof(uintptr_t) );
  self->ptr = copyin( this->header_ptr, sizeof(uintptr_t) );
...

Now since the DbusString contains a C string doing yet another copyin() at the self->ptr location would start reading the header data. This is where the D-Bus protocol specification come in handy. At some point in the specification the different D-Bus types along with their encoding are described. This will be used to interpret the header data. Readers may want to read that before continuing since we will be referring to those data types from now on.

A D-Bus header has a wire format defined as follow :

BYTE, BYTE, BYTE, BYTE, UINT32, UINT32, ARRAY of STRUCT of (BYTE,VARIANT)

The first four bytes are respectively the endianness, message type, flags and the protocol version of the encoded message. The two following unsigned 32 bit integers (D-Bus type UINT32) are the length of the message body and the message serial used to match replies to previously sent messages. This is then followed by a variable number of fields represented as an ARRAY of D-Bus structures each containing a BYTE type to identify the field and a VARIANT type that contains the field data. The exact data type contained in the VARIANT depends on the field. At this point one thing that can be said about the header is that it is of a minimal length of 16 bytes. 12 bytes for the static header data plus 4 additional bytes for the length of the ARRAY which is always there even if the ARRAY is empty.

The script is divided by splitting the clauses across multiple probes all tracing the same function. The first of which will read the initial static section of the header. Thread local variables are used to keep the fields data. Note that I'm type-casting the void pointer this->h_p to an uint8_t* to perform pointer arithmetic on it to extract data from the continuous 16 bytes array residing in scratch memory. (I have a strong personal repulsion against the “char” type)

pid$target:libdbus-1.so:_dbus_message_get_network_data:return
/ self->in_write == 1 /
{
  /* Copy the second function argument of type header** */
  this->header_ptr = *(uintptr_t*)copyin( self->header_ref, sizeof(uintptr_t) );
  self->ptr = copyin( this->header_ptr, sizeof(uintptr_t) );

  /* What we just copied was a pointer to the real data so let's copyin again.
   * Technically the static header data is 12 bytes but the first field */
  this->h_p = copyin( *(uintptr_t*)self->ptr, 16 );

  self->endian_field = ((uint8_t*)this->h_p)[0];
  self->type_field = ((uint8_t*)this->h_p)[1];
  self->flags_field = ((uint8_t*)this->h_p)[2];
  self->version_field = ((uint8_t*)this->h_p)[3];
  self->length_field = *(uint32_t*)((uint8_t*)this->h_p + 4);
  self->serial_field = *(uint32_t*)((uint8_t*)this->h_p + 8);

  self->array_length = *(uint32_t*)((uint8_t*)this->h_p + 12);

  /* Copy the whole array once to the scratch buffer. */
  self->arr = copyin( *(uintptr_t*)self->ptr, 16 + self->array_length );

  /* Position in the header array, we start processing at 16 */
  self->pos = 16;

  self->in_write = 0;
}

The last part of the read data is stored in self->array_length. This is the total length excluding leading and lagging padding of the ARRAY data. This effectively gives us the total amount of bytes that must be read to get what's left of the header. The entire header is read into self->arr for convenience and we initialize a position variable that we will use to keep track of how much data is left to decode and to calculate required padding between fields. Finally, self->in_write is assigned 0 since the remaining clauses will use the array length and position as predicate.

Now come the fun part. DTrace must be used to interpret a collection of bytes of arbitrary length that contains zero or more fields in any order which can have different formats. One may think that the lack of loop construct and more importantly without a simple if is a rigid road block to this task. Nonsense, it is quite possible with a bit of byte juggling and the use of advanced types like associative arrays available in DTrace. Oh, and a healthy dose of the ternary ? operator.

What has to be decoded is a STRUCT of (BYTE,VARIANT). A STRUCT starts at the next 8 byte aligned location and contains its constituent types one after the other interleaved with the necessary padding. The strategy used to decode the fields is to create a clause that can decode any types available and we will copy that clause 9 times for the current number of header fields possible. Each identical clause will check that there is data remaining to be decoded. As said earlier this is for the D-Bus version 1.4.1, the protocol that it implements has 9 different header fields defined and those fields use one of four D-Bus types. The decoding will assume that this is the case which may not be true for future versions of the protocol.

STRUCTs are always 8 bytes aligned, so any padding is accounted for at the start of the clause. This is however not required for the first field since it will always start aligned. Here, the padding calculation is done using the rather classic modulo form.

pid$target:libdbus-1.so:_dbus_message_get_network_data:return
/ self->array_length - self->pos + 16 > 0 /
{
  self->pos = self->pos + (8 - (self->pos % 8)) % 8;
...

We can then read the beginning of the STRUCT which is a BYTE type containing the field type followed by the length of the VARIANT type then its SIGNATURE. Type signatures in D-Bus can be rather complex, describing compound types and structures, but we know that only four base types are used for the header and this type can be deducted from the field type read at the beginning of the clause. For simplicity the SIGNATURE is read using stringof() and the resulting string length is added to the position indicator plus one for the nul byte.

...
  this->type = (uint32_t)((uint8_t*)self->arr)[self->pos++];
  this->variant_len = ((uint8_t*)self->arr)[self->pos++];
  this->variant_type = stringof( ((uint8_t*)self->arr + self->pos) );

  self->pos = self->pos + strlen( this->variant_type ) + 1;
...

The position indicator is now at the start of the field data. The four possible types to be read are STRING, OBJECT_PATH, SIGNATURE and UINT32. However OBJECT_PATH and STRING have the exact same encoding so we are left with three possible cases. Instead of doing a mess out of the ternary operator at this point it's possible to perform three different decoding for the three cases and then only keep the correct one. It may seems dangerous to use stringof() on unidentified data but remember that this header is contained in a DbusString which means it's always null terminated.

Of the three types, the SIGNATURE type starts unaligned so the possible SIGNATURE type is stored directly in this->gdata. The two others types are 4 bytes aligned so we calculate the required padding. For the STRING and OBJECT_PATH types there is a 4 bytes UINT32 that holds the length of the contained string, this is skipped and then read with stringof(). Finally the simple UINT32 is decoded.

...
  this->gdata = stringof( ((uint8_t*)self->arr + self->pos + 1) );

  self->pos = self->pos + (4 - (self->pos % 4)) % 4;

  this->sdata = stringof( ((uint8_t*)self->arr + self->pos + 4) );
  this->udata = *(uint32_t*)((uint8_t*)self->arr + self->pos);
...

The new position where the next field starts depends on the real type of the field currently decoded. In a similar manner to the data the three possible offsets are evaluated and stored separately in an associative array. Why this is done will be clear soon.

...  
  self->offset[0] = 4;
  self->offset[1] = 4 + strlen(this->sdata) + 1;
  self->offset[2] = 1 + strlen(this->gdata) + 1;
...

We then need to choose the right decoding depending on the type of the field, but we must also store the result with the corresponding field. Storing the result is actually very simple, an associative array can be used keyed by the field type stored in the self->type variable. Two different arrays are used since we either store strings or a 32 bit integer. The hypothetical UINT32 can be stored directly. This is possible because the data is keyed by the type. So only key [5] and [9] will (possibly) contain valid data, all other keys will be ignored and may contain garbage.

...
  self->uint_field[this->type] = this->udata;
... 

For the string field we need to check if we must assign either the SIGNATURE or the STRING. This is done using the ? ternary operator using an is_string[] array that translates the field type to the data type.

...
  self->string_field[this->type] = is_string[this->type] == 1 ? this->sdata : this->gdata;
...

The is_string[] array is filled in the BEGIN probe with 0 meaning a UINT32, 1 a STRING and 2 a SIGNATURE.

...
  is_string[1] = 1;
  is_string[2] = 1;
  is_string[3] = 1;
  is_string[4] = 1;
  is_string[5] = 0;
  is_string[6] = 1;
  is_string[7] = 1;
  is_string[8] = 2;
  is_string[9] = 0;
...

Finally the right offset is added to self->pos using the is_string[] array again :

...
  self->pos = self->pos + self->offset[is_string[this->type]];
...

We can then move on to decoding the next field.

At the end of the nine decoding clauses a final clause can be used to print the data. This is rather simple and shown in the full libdbus-write.d script. An example of output is shown here tracing the Audacious music player sending signals about its state changes.

# dtrace -s libdbus-write.d -p `pgrep audacious`
Ready! Press CTRL-C to abort.
signal : path=/Player; interface=org.freedesktop.MediaPlayer; member=StatusChange; error=; dest=; sender=; sig=(iiii); serial=0; fd=0;
signal : path=/Player; interface=org.freedesktop.MediaPlayer; member=StatusChange; error=; dest=; sender=; sig=(iiii); serial=0; fd=0;
signal : path=/Player; interface=org.freedesktop.MediaPlayer; member=StatusChange; error=; dest=; sender=; sig=(iiii); serial=0; fd=0;
signal : path=/Player; interface=org.freedesktop.MediaPlayer; member=TrackChange; error=; dest=; sender=; sig=a{sv}; serial=0; fd=0;
signal : path=/Player; interface=org.freedesktop.MediaPlayer; member=StatusChange; error=; dest=; sender=; sig=(iiii); serial=0; fd=0;
signal : path=/TrackList; interface=org.freedesktop.MediaPlayer; member=TrackListChange; error=; dest=; sender=; sig=i; serial=0; fd=0;

Tracing reads

Catching incoming messages is a bit more complex. There is a do_reading() function similar to do_writing() but a read can be done in many incomplete chunks. Added to that the possibility of corrupted messages, the do_reading() function is not a very good place to trace for incoming messages. Read messages a reconstructed after being read and then checked for correctness. Afterwards it's added to the queue of received messages with possibly the clearing of a timeout timer if it was an awaited reply. Messages are loaded by load_message(), however that function lies in the same compilation unit where it is used so it's often inlined and thus, not a good target for tracing. However, there is another magical function that will save us all from the pain of decoding the complex transport structure, _dbus_header_load() implemented in dbus-marshal-header.c. Here's the function definition :

dbus_bool_t
_dbus_header_load (DBusHeader        *header,
                   DBusValidationMode mode,
                   DBusValidity      *validity,
                   int                byte_order,
                   int                fields_array_len,
                   int                header_len,
                   int                body_len,
                   const DBusString  *str,
                   int                start,
                   int                len)
{
...

A DbusHeader is passed by reference, fortunately the first field of that struct is a DbusString with the header in it. The decoding can be done in exactly the same way as outgoing messages, the complete script is libdbus-read.d. We can also combine both the read and write scripts since the decoding clauses are exactly the same and DTrace allows multiple probes per clause like in libdbus-rw.d.

Here's and example running it on Xchat while receiving an IRC message with pop-up enabled. The first message is sent by Xchat (libnotify to be exact) to the XDG specified Notifications interface. This is received by the notification-daemon in this case who replies having received the message. Finally the notification daemon send a signal informing interested clients that the pop-up was closed (Either by the user or from timing out).

# dtrace -s libdbus-rw.d -p `pgrep xchat`
Ready! Press CTRL-C to abort.
(sent) method call : path=/org/freedesktop/Notifications; interface=org.freedesktop.Notifications; member=Notify; error=;
 dest=org.freedesktop.Notifications; sender=; sig=susssasa{sv}i; serial=0; fd=0;
(recv) method return : path=/org/freedesktop/Notifications; interface=org.freedesktop.Notifications; member=Notify; error=; dest=:1.117;
 sender=:1.46; sig=u; serial=14; fd=0;
(recv) signal : path=/org/freedesktop/Notifications; interface=org.freedesktop.Notifications; member=NotificationClosed; error=; 
dest=:1.117; sender=:1.46; sig=uu; serial=14; fd=0;

Tracing the session bus

The same scripts can be used to trace the session bus, however the code is linked statically inside the dbus-daemon. So to trace the daemon the probes objects must be changed from libdbus-1.so to either dbus-daemon or a.out like in dbus-rw.d. Of course one can also modify the scripts to perform some basic statistics gathering, for example the dbus-stats.d script will output every 30 seconds the total messages sent and received followed by a breakdown by the type of message.

 Total:
   read                                                           133
   write                                                          199
 Sent:
   method return                                                  23
   method call                                                    33
   error                                                          37
   signal                                                         106
 Recv:
   signal                                                         6
   method return                                                  12
   error                                                          21
   method call                                                    94

Conclusion

It's nice to see that we were able to decode the headers without having to resort to USDT probes. Of course the provided scripts would probably break on newer or older implementation of D-Bus but they could be adapted if need be.

Full source : dbus-dtrace.tar.gz

Gotchas :

  • If you encounter an error at script launch time about a failed lookup it may means that the binary were stripped in which case symbols are not available and tracing of most functions is not possible. This is much more of a problem with the daemon.
  • The session is generally rather quiet with hardly more than a few messages per seconds in average, with occasional short burst for example when refreshing a folder in nautilus. However, under sustained heavy traffic DTrace may not be able to process traces output quickly enough and will issue drops warnings. The section on buffers and buffering of the DTrace guide explains dropping in details.

References and recommended readings

Oracle Solaris Dynamic Tracing Guide: http://wikis.sun.com/display/DTrace/Documentation
D-Bus Specification : http://dbus.freedesktop.org/doc/dbus-specification.html

Comments

Post new comment

The content of this field is kept private and will not be shown publicly.