Monday, June 16, 2014

Processing ASN.1 Call Detail Records with Hadoop (using Bouncy Castle) Part 2

The Stand-alone Decoder

Now that we have created sample data, we can create a simple decoder with the Bouncy Castle library.


The decompressStream method is a little overkill, but it lets the decoder handle compressed sample data transparently. It adds a dependency on commons-compress, which is easy to remove: just change the method to return input unchanged.
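To illustrate the idea of a pass-through decompressor, here is a minimal sketch that peeks at the first two bytes and wraps the stream only when it sees the gzip magic number. It is not the method from this post: it swaps commons-compress for the JDK's java.util.zip, so it only recognises gzip, and the names DecompressSketch and readAll are made up for the example.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical class name; a stdlib-only stand-in for the commons-compress version.
class DecompressSketch {

    // Returns a decompressing stream if the input starts with the gzip
    // magic bytes (0x1f 0x8b); otherwise returns the data unchanged.
    static InputStream decompressStream(InputStream input) throws IOException {
        BufferedInputStream buffered = new BufferedInputStream(input);
        buffered.mark(2);            // remember the start so we can rewind
        int b1 = buffered.read();
        int b2 = buffered.read();
        buffered.reset();            // rewind: the caller sees every byte
        if (b1 == 0x1f && b2 == 0x8b) {
            return new GZIPInputStream(buffered);
        }
        return buffered;
    }

    // Small helper (also an assumption) that drains a stream into a byte array.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }
}
```

Because the check happens on a buffered, rewound stream, the rest of the decoder can treat compressed and uncompressed files the same way.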

To iterate through the ASN.1 file, we keep calling readObject on an ASN1InputStream until it returns null at the end of the stream. Each object we get back is used to create a CallDetailRecord instance.
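To show what that read-until-null loop is doing underneath, here is a hand-rolled sketch that walks the top-level tag-length-value elements of a stream. It is not Bouncy Castle code and is no substitute for ASN1InputStream, which handles far more. It assumes single-byte tags and definite lengths of at most four bytes, and the names TlvLoopSketch, Element, readElement and readAll are hypothetical.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; real code should use ASN1InputStream.
class TlvLoopSketch {

    // One top-level element: its tag byte and its raw content bytes.
    static final class Element {
        final int tag;
        final byte[] content;
        Element(int tag, byte[] content) {
            this.tag = tag;
            this.content = content;
        }
    }

    // Reads the next element, or returns null on a clean end of stream.
    static Element readElement(InputStream in) throws IOException {
        int tag = in.read();
        if (tag < 0) {
            return null;                       // nothing left to read
        }
        int first = in.read();
        if (first < 0) {
            throw new EOFException("truncated length");
        }
        int length;
        if (first < 0x80) {
            length = first;                    // short form: length in one byte
        } else {
            int count = first & 0x7f;          // long form: number of length bytes
            if (count == 0 || count > 4) {
                throw new IOException("unsupported length form");
            }
            length = 0;
            for (int i = 0; i < count; i++) {
                int b = in.read();
                if (b < 0) {
                    throw new EOFException("truncated length");
                }
                length = (length << 8) | b;
            }
            if (length < 0) {
                throw new IOException("length too large");
            }
        }
        byte[] content = new byte[length];
        new DataInputStream(in).readFully(content);  // throws EOFException if short
        return new Element(tag, content);
    }

    // The loop itself: keep reading elements until readElement returns null.
    static List<Element> readAll(InputStream in) throws IOException {
        List<Element> elements = new ArrayList<>();
        Element e;
        while ((e = readElement(in)) != null) {
            elements.add(e);
        }
        return elements;
    }
}
```

In the real decoder each returned object would be turned into a CallDetailRecord; here the elements are simply collected in a list.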


Using Bouncy Castle requires some digging into the data format to find out which classes to expect at each level. Now that the decoder is complete, we can move on to the Map/Reduce job. We didn't have to write a stand-alone decoder first and could have jumped straight into the Map/Reduce job, but writing a simple decoder the first time I tackle a binary format has always saved me time.

Update: Links to Part 1, Part 2, Part 3.
