Recording Skype calls. Part 1. Skype API and sound processing

Recording Skype calls can be very useful, especially when we discuss business questions or, for example, approaches to project issues. Listening to the recordings can also help to improve our English and correct mistakes.

I decided to try to develop an application that uses the Skype API and records a conversation to an MP3 file.

I spent a few evenings on it and found one interesting thing about Skype: it can record sound without any add-ons! All we need to do is send a Skype API command that redirects the sound channels to files. Of course, there are some limitations:

  • the redirected sound is saved in WAV format,
  • Skype redirects the microphone and speaker channels separately, one file per channel.

So the output is two huge files (since WAV is not compressed), which are inconvenient to store and listen to.

To create a Skype recorder we need to write a wrapper for the Skype API and a handler class that sends messages to Skype and receives messages from it.

It is also possible to use the official Skype4COM library instead of implementing our own wrapper, but it was more interesting for me to do it myself.

Below is a short tutorial on how to use the Skype API.

Connecting to Skype and "subscribing" to events

First of all, we need to register the discover and attach Windows API messages, since Skype uses them for communication:

[DllImport("user32.dll")]
public static extern uint RegisterWindowMessage(string message);

skypeApiDiscover = RegisterWindowMessage("SkypeControlAPIDiscover");
skypeApiAttach = RegisterWindowMessage("SkypeControlAPIAttach");

Now, to attach to Skype, we need to send an attachment request as a Windows API broadcast message. The message ID is skypeApiDiscover, and as wParam we must provide the handle of the window that will receive further messages from Skype.

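// Overload for messages without a payload: lParam is a plain IntPtr
// and flags is a plain uint, which matches the call below.
[DllImport("user32.dll")]
private static extern IntPtr SendMessageTimeout(
    IntPtr hWnd,
    uint msg,
    IntPtr wParam,
    IntPtr lParam,
    uint flags,
    uint timeout,
    out IntPtr result);

IntPtr result;
SendMessageTimeout(
    new IntPtr(0xFFFF),  // HWND_BROADCAST: all top-level windows
    skypeApiDiscover,
    ourWindowHandle,     // the window that will receive Skype's replies
    IntPtr.Zero,
    0,                   // SMTO_NORMAL
    100,                 // timeout in milliseconds
    out result);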

Skype will show a prompt for the attachment request:

[Screenshot: Skype attachment request prompt]

After we click one of the buttons, Skype sends a Windows API message to the window whose handle we specified. The window must have a WndProc in order to process the message.

public IntPtr WndProc(
    IntPtr hWnd,
    int message,
    IntPtr wParam,
    IntPtr lParam,
    ref bool handled)
{
    // ...
}

As the result of the attachment request, Skype sends a message with message == skypeApiAttach and, in case of success, lParam == 0. The wParam parameter contains the Skype window handle that we will use to send commands.

Once the attachment has succeeded, WndProc starts receiving messages for all Skype events. To filter them, we can check that message is WM_COPYDATA and wParam == skypeWindowHandle. The information about the event is stored in lParam, which points to a COPYDATASTRUCT structure. The COPYDATASTRUCT can be implemented in the following way:

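[StructLayout(LayoutKind.Sequential)]
internal struct CopyDataStruct
{
    public string Id;    // dwData: application-defined value (marshaled as a pointer-sized field)
    public int Size;     // cbData: size of Data in bytes, including the terminating null
    public string Data;  // lpData: the Skype message text
}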

The Data field contains the Skype message data. For example:

CALL 1234 STATUS INPROGRESS

where 1234 is the unique ID of the call. It must be used in other Skype commands to control this conversation.
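To make the message format concrete, here is a minimal sketch of how such a notification could be split into the call ID and the status. The SkypeMessageParser class and its TryParseCallStatus method are hypothetical names used only for this illustration, and the sketch assumes the message has exactly the four space-separated tokens shown above.

```csharp
using System;

// Hypothetical helper (not part of the Skype API): parses notifications
// of the form "CALL <id> STATUS <status>".
public static class SkypeMessageParser
{
    public static bool TryParseCallStatus(string data, out int callId, out string status)
    {
        callId = 0;
        status = null;

        if (string.IsNullOrEmpty(data))
            return false;

        // Expected tokens: "CALL", "<id>", "STATUS", "<status>".
        string[] parts = data.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length != 4 || parts[0] != "CALL" || parts[2] != "STATUS")
            return false;

        if (!int.TryParse(parts[1], out callId))
            return false;

        status = parts[3];
        return true;
    }
}
```

The parsed call ID is the value that later goes into commands such as ALTER CALL.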

Sending commands to Skype

To send a Skype command we need to prepare a CopyDataStruct and send it with the SendMessageTimeout() Windows API function:

[DllImport("user32.dll")]
private static extern IntPtr SendMessageTimeout(
    IntPtr windowHandle,
    uint message,
    IntPtr wParam,
    ref CopyDataStruct lParam,
    SendMessageTimeoutFlags flags,
    uint timeout,
    out IntPtr result);

private void sendSkypeCommand(string command)
{
    var data = new CopyDataStruct
               {
                   Id = "1",
                   Size = command.Length + 1,
                   Data = command
               };

    IntPtr result;
    SendMessageTimeout(
        skypeWindowHandle,
        WM_COPYDATA,
        ourWindowHandle,
        ref data,
        SendMessageTimeoutFlags.Normal,
        100,
        out result);
}

For example, if we want to redirect the sound of the current conversation to files, we should use two Skype commands:

  • ALTER CALL {0} SET_OUTPUT FILE="{1}": redirects the speaker sound, that is, what we hear from our conversation partner (the "output" in Skype terms);
  • ALTER CALL {0} SET_CAPTURE_MIC FILE="{1}": redirects our microphone, that is, what we say to our conversation partner.

The code that sends these commands:

public void RedirectSoundToFile(string inFileName, string outFileName)
{
    var recordInCommand = string.Format(
        "ALTER CALL {0} SET_OUTPUT FILE=\"{1}\"",
        currentCallNumber,
        inFileName);

    var recordOutCommand = string.Format(
        "ALTER CALL {0} SET_CAPTURE_MIC FILE=\"{1}\"",
        currentCallNumber,
        outFileName);

    sendSkypeCommand(recordInCommand);
    sendSkypeCommand(recordOutCommand);
}

Note that after the redirection we will still hear the conversation partner, and the microphone will keep working. Furthermore, another application can request redirection at the same time, and Skype will handle each request separately.

Skype API issues

I found that, to develop an application that automatically detects Skype and is able to reconnect, we need workarounds for a few Skype API issues:

  • We don’t know when Skype starts, since we don’t get API messages without a connection. As a workaround we can wait for the Skype.exe process, but that leads to the second issue.
  • The Skype API works only once we are logged in and the main window is shown. The initial login screen doesn't answer requests. So, to know exactly when we can try to connect, we should wait for a specific window (for example, check for its presence by window class name via FindWindow, or set a global shell hook).
  • Skype doesn't send any API messages if we simply close it (without logging out) or kill the process. In that case, our application will still think that it is connected. Furthermore, if Skype starts again after this, we won't receive any messages. We must explicitly reconnect by sending a new request.
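The reconnect logic implied by these issues can be sketched as a small polling state machine. This is only an illustration under stated assumptions: SkypePresenceTracker and ConnectionAction are hypothetical names, and the isSkypeReady probe is injected by the caller (it could, for instance, combine a Skype.exe process check with a FindWindow check for the main window, whose class name is not given here).

```csharp
using System;

public enum ConnectionAction { None, Attach, MarkDisconnected }

// Hypothetical presence tracker; isSkypeReady is a caller-supplied probe (an assumption).
public sealed class SkypePresenceTracker
{
    private readonly Func<bool> isSkypeReady;
    private bool attached;

    public SkypePresenceTracker(Func<bool> isSkypeReady)
    {
        this.isSkypeReady = isSkypeReady;
    }

    // Call when Skype confirms the attachment (skypeApiAttach with lParam == 0).
    public void OnAttached()
    {
        attached = true;
    }

    // Call periodically, for example from a timer.
    public ConnectionAction Poll()
    {
        bool ready = isSkypeReady();

        if (attached && !ready)
        {
            // Skype was closed or killed without notifying us.
            attached = false;
            return ConnectionAction.MarkDisconnected;
        }

        if (!attached && ready)
        {
            // The main window is available: a new discover request should be sent.
            return ConnectionAction.Attach;
        }

        return ConnectionAction.None;
    }
}
```

Because the tracker forgets the connection as soon as the probe fails, a restarted Skype leads to a fresh attachment request instead of being silently ignored.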

The main steps to write a simple recorder

  1. Implement a wrapper for the Skype API that allows sending and receiving messages, or use the official library.
  2. Implement some kind of connector that processes received messages and watches for Skype presence.
  3. Implement application settings (filters, blacklist, etc.).
  4. Implement the logic that decides how to react to Skype events depending on the settings.
  5. Implement post-processing for redirected sound files.
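As an illustration of steps 3 and 4, the decision logic might look roughly like the sketch below. Everything in it is hypothetical: the RecorderSettings, RecorderPolicy and RecorderAction names, the shape of the blacklist, and the assumption that recording starts on the INPROGRESS status and stops on a FINISHED status.

```csharp
using System;
using System.Collections.Generic;

public enum RecorderAction { Ignore, StartRecording, StopRecording }

// Hypothetical settings object; a real application may structure this differently.
public sealed class RecorderSettings
{
    // Contacts whose calls must never be recorded (case-insensitive).
    public HashSet<string> BlackList { get; } =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase);
}

public static class RecorderPolicy
{
    // Maps a call status notification to an action, taking the settings into account.
    public static RecorderAction Decide(RecorderSettings settings, string contact, string callStatus)
    {
        if (settings.BlackList.Contains(contact))
            return RecorderAction.Ignore;

        switch (callStatus)
        {
            case "INPROGRESS": return RecorderAction.StartRecording;
            case "FINISHED": return RecorderAction.StopRecording; // assumed end-of-call status
            default: return RecorderAction.Ignore;
        }
    }
}
```

Keeping the decision in a pure function like this makes it easy to test without a running Skype instance.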

Post-processing redirected sound files

Skype saves sound "channels" separately in WAV format. To get the MP3 file we need to:

  1. Merge files into one file.
  2. Convert WAV to MP3.

The easiest way is to use external tools. For example, we can use the free, open-source tools SoX and LAME.

SoX can merge, convert, and apply various effects to sound files. To merge two WAV files we can simply start a process with the following parameters:

sox.exe -m FirstInputWavFile SecondInputWavFile OutputWavFile

LAME provides very fast, high-quality conversion of WAV files to MP3. To convert the previously merged WAV file to MP3 we run LAME with the following parameters:

lame.exe -V2 InputWavFile OutputMp3File

The -V switch sets the variable bitrate (VBR) encoding quality, from 0 (highest quality) to 9 (lowest).
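The two command lines above can be launched from C# roughly as follows. This is a minimal sketch, assuming that sox.exe and lame.exe can be found on the PATH; the SoundPostProcessor class and its method names are hypothetical and exist only for this example.

```csharp
using System;
using System.Diagnostics;

public static class SoundPostProcessor
{
    // Wraps a path in quotes so that spaces survive command-line parsing.
    private static string Quote(string path)
    {
        return "\"" + path + "\"";
    }

    // sox.exe -m FirstInputWavFile SecondInputWavFile OutputWavFile
    public static ProcessStartInfo BuildSoxMerge(string inWav, string outWav, string mergedWav)
    {
        return new ProcessStartInfo(
            "sox.exe",
            string.Format("-m {0} {1} {2}", Quote(inWav), Quote(outWav), Quote(mergedWav)))
        {
            UseShellExecute = false,
            CreateNoWindow = true
        };
    }

    // lame.exe -V2 InputWavFile OutputMp3File
    public static ProcessStartInfo BuildLameEncode(string mergedWav, string mp3)
    {
        return new ProcessStartInfo(
            "lame.exe",
            string.Format("-V2 {0} {1}", Quote(mergedWav), Quote(mp3)))
        {
            UseShellExecute = false,
            CreateNoWindow = true
        };
    }

    // Runs the tool, waits for it to finish, and returns true on exit code 0.
    public static bool Run(ProcessStartInfo startInfo)
    {
        using (Process process = Process.Start(startInfo))
        {
            process.WaitForExit();
            return process.ExitCode == 0;
        }
    }
}
```

A caller would run the merge first and start the LAME step only if Run returned true for the SoX step, then delete the intermediate WAV files.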

That's all. In part 2 I will talk about Skype Auto Recorder, my open-source pet project for recording Skype calls. All code examples here were taken from it.