Guileful BSTR strings

Let’s talk about one more nasty data type – BSTR (Basic string or binary string).

The fragment is taken from the VirtualBox project. The code contains an error that the analyzer reports as follows: V745 A ‘wchar_t *’ type string is incorrectly converted to ‘BSTR’ type string. Consider using ‘SysAllocString’ function.

....
HRESULT EventClassID(BSTR bstrEventClassID);
....
hr = pIEventSubscription->put_EventClassID(
                    L"{d5978630-5b9f-11d1-8dd2-00aa004abd5e}");

Explanation

Here’s how a BSTR type is declared:

typedef wchar_t OLECHAR;
typedef OLECHAR * BSTR;

At first glance, “wchar_t *” and BSTR seem to be one and the same thing. They are not, and this causes a lot of confusion and errors.

Let’s look at the BSTR type more closely to get a better idea of this case.

Here is what the MSDN site says. Reading MSDN documentation isn’t much fun, but we have to do it.

A BSTR (Basic string or binary string) is a string data type that is used by COM, Automation, and Interop functions. Use the BSTR data type in all interfaces that will be accessed from script.

BSTR description:

  1. Length prefix. A four-byte integer that contains the number of bytes in the following data string. It appears immediately before the first character of the data string. This value does not include the terminating null character.
  2. Data string. A string of Unicode characters. May contain multiple embedded null characters.
  3. Terminator. Two null characters.

A BSTR is a pointer. The pointer points to the first character of the data string, not to the length prefix. BSTRs are allocated using COM memory allocation functions, so they can be returned from methods without concern for memory allocation. The following code is incorrect:

BSTR MyBstr = L"I am a happy BSTR";

This code builds (compiles and links) correctly, but it will not function properly because the string does not have a length prefix. If you use a debugger to examine the memory location of this variable, you will not see a four-byte length prefix preceding the data string. Instead, use the following code:

BSTR MyBstr = SysAllocString(L"I am a happy BSTR");

A debugger that examines the memory location of this variable will now reveal a length prefix containing the value 34. This is the expected value for a 17-byte single-character string that is converted to a wide-character string through the inclusion of the “L” string modifier. The debugger will also show a two-byte terminating null character (0x0000) that appears after the data string.

If you pass a simple Unicode string as an argument to a COM function that is expecting a BSTR, the COM function will fail.

We hope this is enough to show why BSTR strings and plain “wchar_t *” strings must be kept apart.

Additional links:

  1. MSDN. BSTR.
  2. StackOverflow. Static code analysis for detecting passing a wchar_t* to BSTR.
  3. StackOverflow. BSTR to std::string (std::wstring) and vice versa.
  4. Robert Pittenger. Guide to BSTR and CString Conversions.
  5. Eric Lippert. Eric’s Complete Guide To BSTR Semantics.

Correct code

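BSTR bstrEventClassID =
       SysAllocString(L"{d5978630-5b9f-11d1-8dd2-00aa004abd5e}");
hr = pIEventSubscription->put_EventClassID(bstrEventClassID);
SysFreeString(bstrEventClassID); // the caller owns the BSTR and must free it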

Recommendation

If you come across an unfamiliar type, don’t hurry: look it up in the documentation first. This is important to remember, so it’s no big deal that the tip is repeated here once again.

Written by Andrey Karpov.
This error was found with the PVS-Studio static analysis tool.

5 thoughts on “Guileful BSTR strings”

  1. That Correct Code is not perfect: it creates a memory leak. A string copied by SysAllocString must be freed by SysFreeString.

    • Of course you need to clean up memory. But it goes beyond an article. It would sound strange if we tell programmers to clean up memory after its allocation. It’s obvious 🙂 It’s not a programming basics textbook.

      • Ok, I was a bit afraid of people that just copy-paste 🙂
        Nice article btw., it helped me a lot.