EFF secure messaging scorecard review

Part 2 - libotr

By Daniel Hodson / Matt Jones on 11 August 2016

TLDR: We decided to audit libotr to gauge its general maturity. Within several hours we identified a number of insecure coding practices and a lack of documentation, which we reported. While we didn’t find any serious security vulnerabilities, our gut feeling is that the code uses enough bad programming constructs to warrant further investigation before getting a tick of approval. This post shares some of our work from the audit, along with some software security recommendations relevant to the EFF Secure IM Scorecard work.


Back in February we published part one of our EFF Secure IM Scorecard review. In the first post we introduced the scorecard and took a look at an application called RetroShare, in which a short audit revealed several high-impact vulnerabilities, illustrating areas where the scorecard needed improvement.

Since then, we followed up on other clients and libraries, and presented at the BSides Canberra conference here in Australia. Shortly after, the EFF took down their scorecard and announced a new version is in the works.

… huzzah! Without sugar-coating it too much, the initial scorecard was broken and, quite frankly, dangerous. As it’s being revised now, we’ll shift the focus of this post to look at software security beyond a static snapshot of security bugs, based on some time spent reviewing libotr.

One of the reasons we wanted to look at libotr was due to the following thread related to the EFF Scorecard that explicitly calls it out with a bet on its security:

Matthew Green and I had a bet for the last year, which just ended, over libotr’s security; I bet him that nobody would find a sev:hi flaw in it all year, and, of course, won, because at this point all the low-hanging fruit in libotr has been shaken out. That bet was a reaction to the release of the EFF scorecard, which at the time gave Cryptocat(†) a perfect score but dinged ChatSecure, which is a libotr client, for not having an audit done.


Libotr is a library (C, Java) used by a number of clients to speak the OTR protocol, including Adium, ChatSecure and Jitsi natively and then Irssi, Miranda, and Pidgin via plug-in. The OTR protocol was first presented in 2004 as an improvement over OpenPGP and S/MIME, supporting forward-secrecy and deniable authentication. The protocol has gone through a few revisions after published research, most notably the papers “Secure Off-the-Record Messaging” and “Finite-State Security Analysis of OTR Version 2”.

Libotr itself has a small code-base of approximately 7000 lines of C and C/C++ header code (we did not focus on the Java library in our review).

We considered the attack surface related to:

  • Vulnerabilities and weaknesses in the OTR protocol itself
  • Vulnerabilities in the libotr library implementation
  • Vulnerabilities introduced directly in libotr plugins or libotr integration natively within a client
  • Vulnerabilities in the clients that allow compromising the system or sensitive data in libotr or libotr plugins

In late February 2016 we quickly audited the libotr library and several consumers of the library using manual code review.


The repeatedly found vanilla overflow

Fairly early on we stumbled across a number of bad constructs related to memory allocation and reallocation. Unfortunately for us, there were usually mitigating circumstances that limited exploitability, leaving most findings best described as “insecure coding practice”. For example, see the following construct, which includes a few of our audit notes:

/* Accumulate a potential fragment into the current context. */
    OtrlFragmentResult otrl_proto_fragment_accumulate(char **unfragmessagep,
        ConnContext *context, const char *msg)
    {
        /* ... */
        if (k > 0 && n > 0 && k <= n && start > 0 && end > 0 && start < end) { <-- this may affect k==1 case below, as start/end can't be >= 0x7fffffff
            if (k == 1) {
                int fraglen = end - start - 1; <-- signed integer, 32-bit
                size_t newsize = fraglen + 1; <-- unsigned integer, sign extension for fraglen >= 0x7fffffff, size_t result (32 vs 64-bit)

                context->context_priv->fragment = NULL;
                if (newsize >= 1) {  /* Check for overflow */
                    context->context_priv->fragment = malloc(newsize);
                }
                if (context->context_priv->fragment) {
                    memmove(context->context_priv->fragment, tag + start, fraglen);
                    context->context_priv->fragment_len = fraglen;
                    context->context_priv->fragment[context->context_priv->fragment_len] = '\0'; <-- array indexing error etc
                }
            } else if (n == context->context_priv->fragment_n &&
                    k == context->context_priv->fragment_k + 1) {
                int fraglen = end - start - 1; <-- same as above
                char *newfrag = NULL;
                size_t newsize = context->context_priv->fragment_len + fraglen + 1; <-- fragment_len+fraglen overflow

                /* Check for overflow */
                if (newsize > context->context_priv->fragment_len) { <-- d'oh
                    newfrag = realloc(context->context_priv->fragment, newsize); <-- newsize alloc, fraglen in memmove
                }
                if (newfrag) {
                    context->context_priv->fragment = newfrag;
                    memmove(context->context_priv->fragment + context->context_priv->fragment_len,
                        tag + start, fraglen);
                    context->context_priv->fragment_len += fraglen;
                    context->context_priv->fragment[context->context_priv->fragment_len] = '\0';
                }
            }
        }

We spotted a number of concerns along these lines, took notes, and moved on to other areas, planning to come back. A few days later, though, the following patch was issued:

In several places in proto.c, the sizes of portions of incoming messages were stored in variables of type int or unsigned int instead of size_t. If a message arrives with very large sizes (for example unsigned int datalen = UINT_MAX), then constructions like malloc(datalen+1) will turn into malloc(0), which on some architectures returns a non-NULL pointer, but UINT_MAX bytes will get written to that pointer.

This ended up patching at least one remotely triggerable vector. In the post-vulnerability Twitter haze, we found that, crazily, Ivan had pointed out the vanilla allocation overflow construct in the discussion several months before.

The missing message.c overflow patch

The next thing we stumbled across was a potential overflow in otrl_message_symkey() via the usedatalen parameter.

We ended up Googling this function and came across the following patch dating back to late 2014:

src/message.c | 12 +++++++++---
    1 file changed, 9 insertions(+), 3 deletions(-)

    diff --git a/src/message.c b/src/message.c
    index 68ee9e7..333ee0c 100644
    --- a/src/message.c
    +++ b/src/message.c
    @@ -22,6 +22,7 @@
    /* system headers */
    #include <stdio.h>
    #include <stdlib.h>
    +#include <stdint.h>
    #include <time.h>
    /* libgcrypt headers */
    @@ -1923,17 +1924,22 @@ gcry_error_t otrl_message_symkey(OtrlUserState us,
        unsigned int use, const unsigned char *usedata, size_t usedatalen,
        unsigned char *symkey)
    -    if (!context || (usedatalen > 0 && !usedata)) {
    -   return gcry_error(GPG_ERR_INV_VALUE);
    +    if (!context || (usedatalen > 0 && !usedata) || usedatalen > SIZE_MAX-4) {
    +        return gcry_error(GPG_ERR_INV_VALUE);
        if (context->msgstate == OTRL_MSGSTATE_ENCRYPTED &&
            context->context_priv->their_keyid > 0) {
    -   unsigned char *tlvdata = malloc(usedatalen+4);
    +        unsigned char *tlvdata = NULL;
        char *encmsg = NULL;
        gcry_error_t err;
        OtrlTLV *tlv;
    +        tlvdata = malloc(usedatalen+4);
    +        if (!tlvdata) {
    +            return gcry_error(GPG_ERR_ENOMEM);
    +        }

Interestingly, this patch never appeared to make it into the main branch, so we shot over an email to the developers, who promptly replied.

This parameter is only supplied by the user’s client and not directly over the network (unlike the previous finding), so a patch would only be defense in depth - nonetheless it was planned for merging.

A glance at plugin implementations

For a break we decided to spend a bit of time glancing over some clients using the library. While looking at irssi-otr, we came across the following in enqueue_otr_fragment:

static enum otr_msg_status enqueue_otr_fragment(const char *msg,
        struct otr_peer_context *opc, char **full_msg)
    {
        enum otr_msg_status ret;
        size_t msg_len;

        /* We are going to use it quite a bit so ease our life a bit. */
        msg_len = strlen(msg); <-- [1]

        if (opc->full_msg) {
            /* ... */
        } else {
            char *pos;

            /*
             * Try to find the OTR message tag at the _beginning_ of the packet
             * and check if this packet is not the end with the end tag of OTR ".".
             */
            pos = strstr(msg, OTR_MSG_BEGIN_TAG);
            if (pos && (pos == msg) && msg[msg_len - 1] != OTR_MSG_END_TAG) {

                /* Allocate full message buffer with an extra for NULL byte. */
                opc->full_msg = zmalloc((msg_len * 2) + 1); <-- [2]
                if (!opc->full_msg) {
                    ret = OTR_MSG_ERROR;
                    goto end;
                }

                /* Copy full message with NULL terminated byte. */
                strncpy(opc->full_msg, msg, msg_len); <-- [3]
                opc->msg_len += msg_len;
                opc->msg_size += ((msg_len * 2) + 1);
                opc->full_msg[opc->msg_len] = '\0';

In this instance, the string length at [1] would need to be at least 2^31 (2GB) so that the allocation size at [2] wraps past the 2^32 limit of a 32-bit size_t, resulting in an integer overflow. The developers acknowledged this as a possible issue and said a patch would be issued (although low priority, as new irssi branches support OTR natively).

Is it thread safe? (Yes/No/Maybe): M

Before we closed out our time on this project, we came across possible issues when libotr is used by multithreaded clients (e.g. ChatSecure, although it does use synchronous queues). We raised the question “is libotr threadsafe?” in one of our reports, and one developer replied that it should be.

In the code, the easiest way to spot multithreading issues is to look for cases where the library allocates and uses a shared control structure, such as ConnContext, without any locking mechanism. This can result in race conditions such as the following in proto.c:

/* Accumulate a potential fragment into the current context. */
    OtrlFragmentResult otrl_proto_fragment_accumulate(char **unfragmessagep,
        ConnContext *context, const char *msg)
    {
        /* ... */
        if (k > 0 && n > 0 && k <= n && start > 0 && end > 0 && start < end) {
            if (k == 1) {
                size_t fraglen = end - start - 1;
                size_t newsize = fraglen + 1;

                free(context->context_priv->fragment); <-- [2] Thread.b frees fragment
                context->context_priv->fragment = NULL;
                if (newsize >= 1) { /* Check for overflow */
                    context->context_priv->fragment = malloc(newsize); <-- [1] Thread.a allocates fragment
                }
                if (context->context_priv->fragment) {
                    memmove(context->context_priv->fragment, tag + start, fraglen); <-- [3] Thread.a copies into unallocated memory
                    context->context_priv->fragment_len = fraglen;
                    context->context_priv->fragment[context->context_priv->fragment_len] = '\0';
                    context->context_priv->fragment_n = n;
                    context->context_priv->fragment_k = k;
                } else {
                    context->context_priv->fragment_len = 0;
                    context->context_priv->fragment_n = 0;
                    context->context_priv->fragment_k = 0;
                }

The following shows the callstack of an access violation triggered by this multithreaded issue:

[daniel@localhost unit]$ libtool --mode=execute gdb test_proto -q  
    Reading symbols from /home/daniel/tmp/libotr/tests/unit/.libs/lt-test_proto...done.  
    (gdb) set print pretty on  
    (gdb) r  
    Thread 7 "lt-test_proto" received signal SIGSEGV, Segmentation fault.  
    [Switching to Thread 0x7ffff4907700 (LWP 32442)]  
    otrl_proto_fragment_accumulate (unfragmessagep=0x7ffff4906f08, context=0x5555557577e0, msg=<optimized out>) at proto.c:932 
    932   context->context_priv->fragment[context->context_priv->fragment_len] = '\0';  
    (gdb) p *context->context_priv  
    $1 = {  
        fragment = 0x0,  
        fragment_len = 0,

After we raised this, we received clarification from another developer that everything (other than key generation) needs to be called from a single thread.

No, that’s not true at all. There’s one bit of functionality (key
generation, carefully documented) that allows you to do that one thing
in a separate thread, but everything else in libotr needs to be called
from one thread.


With many bugs all eyes are shallow

Software security really cannot be gauged via what vulnerabilities have been publicised, regardless of bets or bounties.

While focusing on susceptibility to various vulnerabilities is important, another key element of software security is readiness to respond to potential security issues.

The following are some general recommendations for those running open-source software projects:

  • Establish a process for security advisories and incidents - i.e. provide clear channels and guidelines and ensure a dedicated encrypted point of contact (e.g. email and PGP)
  • Ensure potential security issues are managed efficiently - crashes, added validation checks, etc. should be correctly flagged, reviewed, restricted, and prioritised in your bug tracking system.
  • Build a consistent approach to defensive programming - it can add resilience against entire vulnerability classes, but it does need a little time to mature. Quite often software is written with assumptions about trust boundaries and data flows that are either invalid or change over time (e.g. in the next release)
  • Always assume exploitable - it’s 2016 and time and time again we’ve seen the same discussions on exploitability. Dangerous constructs often sit in code for some time, either until someone takes the time and effort to find and exploit them publicly, or because the vectors aren’t immediately understood. Making a habit of eradicating dangerous constructs and playing defensively is a lot easier and may save the headache of periodically patching high severity vulnerabilities.

The importance of documentation

Libotr is a library, and like all libraries, it has edge-cases and behaviours that can have security consequences. Developers interfacing with its APIs will inevitably encounter quirks, so clear documentation is particularly important.

For example, man pages covering libc functions detail not only the appropriate usage of the function, but any edge-cases that can lead to potential security issues, such as the following for the *nprintf() family:

The functions snprintf() and vsnprintf() do not write more than size bytes (including the terminating null byte (‘\0’)). If the output was truncated due to this limit then the return value is the number of characters (excluding the terminating null byte) which would have been written to the final string if enough space had been available. Thus, a return value of size or more means that the output was truncated. (See also below under NOTES.)

If an output error is encountered, a negative value is returned.

The following are some general recommendations for those running open-source software projects:

  • Have a centralised place that clearly documents interfaces - this should cover all known edge-cases and important universal considerations (e.g. is this code thread safe?). Ideally, this should have a form of security checklist of things to be careful of.
  • Perform basic threat modeling of your project - the objective is to document and share an understanding amongst developers and people auditing the software. Understanding the data flows, privilege levels, assets and in-scope components is a good starting point. Start really, really simple; it can build up over time.

Compounded attack surface

While libotr is quite small, it’s built to be used by IM clients, some of which support an immense range of functionality with extensive attack surface. Without a security-conscious design, vulnerabilities and weaknesses become cheaper to attack and higher in impact.

As such, the concept of incorporating security mitigations by default is quite important for any software with high security requirements.

In the Chromium sandbox FAQ they have the following:

This is very neat. Can I use the sandbox in my own programs?

Yes. The sandbox does not have any hard dependencies on the Chromium browser and was designed to be used with other Internet-facing applications. The main hurdle is that you have to split your application into at least two interacting processes. One of the processes is privileged and does I/O and interacts with the user; the other is not privileged at all and does untrusted data processing.

Isn’t that a lot of work?

Possibly. But it’s worth the trouble, especially if your application processes arbitrary untrusted data. Any buffer overflow or format decoding flaw that your code might have won’t automatically result in malicious code compromising the whole computer. The sandbox is not a security silver bullet, but it is a strong last defense against nasty exploits.

The following are some general recommendations for those running open-source software projects:

  • Research how to apply defense-in-depth principles - i.e. sandboxing and privilege separation
  • Ensure your code is compiled with modern security features to leverage protections from your environment/platform - e.g. stack canaries, DEP/NX and ASLR on Windows and Linux


As the EFF scorecard is offline and being revised, this will be our last post on the subject for now. There are many, many considerations for gauging software security, and we hope future scorecards take the time to properly consider the breadth and depth of factors needed for them to be reasonable.

Daniel specialises in code review, penetration testing and developer education.

Matt specialises in threat modelling, code review and data analysis.

They both have a decade+ of experience hacking things.