Manually Parsing an x509 Certificate in PHP
One might ask why you would ever need to manually parse an x509 certificate in PHP when there are handy functions like openssl_x509_parse just lying around. The answer? To verify the certificate's signature. Why can't you just use openssl_x509_checkpurpose, you say? Well, because there are bugs in certain combinations of PHP and OpenSSL pertaining to this function and the 'any' purpose. The PHP and OpenSSL that come with CentOS by default just so happen to have these bugs, and I write software that has to run on CentOS with its defaults.
If you are brave enough, read on...
So now that we are past the why, let's talk about the what. What is it that I need to do? I need to parse an x509 certificate and verify that it is signed by a certain certificate authority. The process of doing this is:
- base64 decode the certificate
- decode the ASN.1 BER formatted data
- get the signature element from the decoded data
- decrypt the signature element based on the specified algorithm
- parse the ASN.1 BER formatted signature data
- compare the signature value (hash of the type specified in the certificate) with the hash of the DER encoded certificate data
Step number 2 above is what this article is going to deal with. First things first: if you want a pretty good explanation of ASN.1 and related topics, go here: http://luca.ntop.org/Teaching/Appunti/asn1.html. One thing to note is that my class for parsing only really deals with universal types. I had no need to write code to deal with application, private, or context-specific types.
So we need to know what we have and what we want. We have a string of bytes; we want a PHP array that represents what is stored in those bytes. Yes, I said array, not object. I chose to have the output of my parsing engine be an array for simplicity. For my purposes there is no reason to represent the output as a complicated structure of objects and nested objects; a simple array will do.
Our string of bytes to parse is of the form "<tag><length><data>". Where:
- <tag> is a single byte that tells the type of data to expect
- <length> is 1 or more bytes that express the length of the data to expect
- <data> is the data (didn't see that one coming did you?)
There are a multitude of data types but there are really only a few that you need to worry about. Those are integer, bit string, octet string, null, object identifier, sequence, set, printable string, t61 string, ia5 string, and utc time.
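For reference, the types listed above can be kept in a small lookup table keyed by tag byte. This is only a sketch: the helper name asn1TagName is made up for illustration and is not part of my class. The byte values are the standard universal tags, with sequence and set shown in their constructed form (0x30 and 0x31), which is how they show up in certificates.

```php
<?php
// Map a tag byte to a readable type name.
// asn1TagName is a hypothetical helper, not part of the ASN1 class.
function asn1TagName($tagByte)
{
    static $names = array(
        0x02 => 'integer',
        0x03 => 'bit string',
        0x04 => 'octet string',
        0x05 => 'null',
        0x06 => 'object identifier',
        0x13 => 'printable string',
        0x14 => 't61 string',
        0x16 => 'ia5 string',
        0x17 => 'utc time',
        0x30 => 'sequence', // 0x10 with the "constructed" bit 0x20 set
        0x31 => 'set',      // 0x11 with the "constructed" bit 0x20 set
    );
    return isset($names[$tagByte]) ? $names[$tagByte] : 'unknown';
}

// A certificate is one big sequence, so its first byte is 0x30.
echo asn1TagName(0x30), "\n";
```

Anything that is not in the table comes back as 'unknown', which is a convenient place to notice a type you have not handled yet.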
Using this knowledge you can determine that you basically need to read the tag, then read the length, then read the data and process the data, then start over. The byte that comes after the data is the tag for the next piece of data. For sequences and sets, the data is just another ASN.1 BER encoded set of items; this is how nested structures are built.
So on to the parsing. If we were to describe in English what the parsing is going to do, it would be:
- read the first byte to determine data type
- read the second byte to determine LENGTH
- if the length byte is greater than 127, then read the next x bytes, where x is the length byte minus 128
- convert those to an int value and that is your LENGTH
- read LENGTH bytes from our string and that is your data
- use the type of data to process that data as needed: If the data type is a sequence or set then run our decoding routines on that data and put the return value in place of this data (recursion).
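The steps above can be sketched in a few lines of PHP. This is a minimal illustration and not the code from my class: the function names readTlv and parseItems are made up, sequences and sets are recognised by their constructed tag bytes 0x30 and 0x31, lengths are assumed to be definite (as DER requires), and there is no bounds checking on malformed input.

```php
<?php
// Read one <tag><length><data> item starting at $offset.
// Returns array(tag byte, data bytes, offset of the next item).
function readTlv($bytes, $offset)
{
    $tag = ord($bytes[$offset]);
    $length = ord($bytes[$offset + 1]);
    $offset += 2;
    if ($length > 127) {
        // Long form: the length byte minus 128 is the number of length bytes.
        $numBytes = $length - 128;
        $length = 0;
        for ($i = 0; $i < $numBytes; $i++) {
            $length = ($length << 8) | ord($bytes[$offset + $i]);
        }
        $offset += $numBytes;
    }
    $data = (string) substr($bytes, $offset, $length);
    return array($tag, $data, $offset + $length);
}

// Parse every item in $bytes, recursing into sequences and sets.
function parseItems($bytes)
{
    $items = array();
    $offset = 0;
    $total = strlen($bytes);
    while ($offset < $total) {
        list($tag, $data, $offset) = readTlv($bytes, $offset);
        if ($tag === 0x30 || $tag === 0x31) {
            // The data of a sequence or set is itself a run of items.
            $data = parseItems($data);
        }
        $items[] = array($tag, $data);
    }
    return $items;
}

// SEQUENCE { INTEGER 5, NULL }
print_r(parseItems("\x30\x05\x02\x01\x05\x05\x00"));
```

The other data types are left as raw bytes here; a real parser would convert them (integers, strings, times, OIDs) at the point where this sketch only recurses.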
There is one really tricky data type that you will run into called an Object Identifier, or OID. The basic idea is that you have a sequence with 1 or more values where the first value is the OID and the rest are pieces of data pertaining to what the OID represents. To see a pretty good list of OIDs and what they represent, you can go here: http://www.oid-info.com/search.htm.
The data that we have is a byte string that we need to change into a dot-notated OID. An example of a dot-notated OID is "1.2.840.113549.1.1.5", which is the OID for "sha1WithRSAEncryption". Section 5.9 of http://luca.ntop.org/Teaching/Appunti/asn1.html has a much better explanation of how to decode OIDs than I could ever write, so if you want to know more you can go read that. The short description is that the first two pieces of the OID are represented in the first byte, and the rest follow that as one or more bytes each. The first piece = floor(first byte / 40), the second piece = first byte % 40.
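To make that concrete, here is a small sketch that decodes the bytes of the example OID above. The function name oidBytesToDotted is made up for illustration and this is not the code from my class. Each piece after the first two is stored 7 bits per byte, and a set high bit means more bytes follow for the same piece.

```php
<?php
// Convert the raw data bytes of an OID into dot notation.
// oidBytesToDotted is a hypothetical helper, not part of the ASN1 class.
function oidBytesToDotted($bytes)
{
    $first = ord($bytes[0]);
    $parts = array((int) floor($first / 40), $first % 40);
    $value = 0;
    $count = strlen($bytes);
    for ($i = 1; $i < $count; $i++) {
        $byte = ord($bytes[$i]);
        // Each byte contributes its low 7 bits to the current piece.
        $value = ($value << 7) | ($byte & 0x7F);
        if ($byte < 0x80) {
            // High bit clear: this was the last byte of the piece.
            $parts[] = $value;
            $value = 0;
        }
    }
    return implode('.', $parts);
}

echo oidBytesToDotted("\x2A\x86\x48\x86\xF7\x0D\x01\x01\x05"), "\n";
// prints 1.2.840.113549.1.1.5
```

Working it by hand: 0x2A is 42, so the first two pieces are floor(42 / 40) = 1 and 42 % 40 = 2. The bytes 0x86 0x48 give (6 * 128) + 72 = 840.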
As a side note, the signature data in XML documents that use xmlSecurity is an ASN.1 BER encoded string. I have had need to manually parse and verify the signatures on some XML documents that use xmlSecurity before, and I use this parsing code for that as well as for regular x509 certificate parsing.
So now that I have described what is going to happen, here is how to do it in PHP.
- Download my ASN.1 parsing class from here. (Note that the class also contains functions to DER encode data that produce output that is binary compatible with what OpenSSL produces when doing DER encoding, but that is not what this article is about, save some for another day!)
- Include it in your PHP file somewhere.
- Insert this code:
    // assumed: load the PEM file; put a valid path to a certificate here
    $cert = file_get_contents('/path/to/certificate.pem');
    $rawCertData = base64_decode(str_replace(array(
        "-----BEGIN CERTIFICATE-----",
        "-----END CERTIFICATE-----"
    ), '', $cert));
    $parsedCert = ASN1::parseASNString($rawCertData);
    print_r($parsedCert); // assumed: print the resulting array
If you run this and put a valid path to a certificate then you will get a large array printed on the screen. The output is of the format:
array([data type], [data])
Where the data can be an array of nested data, where each item can also have nested data. This can go on indefinitely, of course. If you do not want to build this class into a system that you already have, there is an example file named test.php included in the archive that you can just run to see the output. It uses GeoTrust's root certificate as the input.
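Because the nested array gets large quickly, an indented outline can be easier to read than the raw print-out. The following is a rough sketch that assumes every node is shaped like array(type, data) and that nested data is a plain list of such nodes; the real output of the class may differ in detail, and dumpAsn1Tree is a made-up name.

```php
<?php
// Render a nested array(type, data) structure as an indented outline.
// dumpAsn1Tree is a hypothetical helper; the node shape is an assumption.
function dumpAsn1Tree(array $nodes, $depth = 0)
{
    $out = '';
    foreach ($nodes as $node) {
        list($type, $data) = $node;
        $indent = str_repeat('  ', $depth);
        if (is_array($data)) {
            // Nested data: print the type, then the children one level deeper.
            $out .= $indent . $type . "\n" . dumpAsn1Tree($data, $depth + 1);
        } else {
            // Leaf data is shown as hex so binary bytes are safe to print.
            $out .= $indent . $type . ': ' . bin2hex($data) . "\n";
        }
    }
    return $out;
}

$example = array(
    array('sequence', array(
        array('integer', "\x05"),
        array('null', ''),
    )),
);
echo dumpAsn1Tree($example);
```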
To see that demo go here: http://krisbailey.com/demos/asn1/test.php
Any questions? Corrections?
Comments
Kris, am I missing something? Where is the code to encode to der in asn1.php?
I am trying to figure out how to manually verify signatures. From my limited understanding, you decrypt the signature with the public key of the issuing CA, then md5 that decrypted signature, and then md5 hash ???? and compare?
Would love to see your code that implements steps 5 and 6 above
Here’s my optimised version of the OID value parsing:
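// Create DER encoded value of OID's value in dot notation
// Example input: "1.2.840.113549.1.1.4"
// Example output: "\x2A\x86\x48\x86\xF7\x0D\x01\x01\x04"
function createOID($dot_notation) {
    // Create array from dot notation
    $values = explode('.', $dot_notation);
    // Validate input
    if ((count($values) < 2) || ($values[0] > 2) || ($values[1] > 39))
        return false;
    // Construct first byte
    $oid = chr(40 * $values[0] + $values[1]);
    // Construct remaining bytes
    $count = count($values);
    for ($i = 2; $i < $count; $i++) {
        // Loop body reconstructed (the original was garbled): emit 7 bits at a
        // time, low-order group first, setting the high bit on all but the last byte
        $value = (int) $values[$i];
        $stack = '';
        do {
            $stack .= chr(($value & 127) | (($stack === '') ? 0 : 128));
            $value >>= 7;
        } while ($value > 0);
        // Add next value
        $oid .= strrev($stack);
    }
    return $oid;
}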
// Get dot notation form of DER encoded OID's value
// Example input: "\x2A\x86\x48\x86\xF7\x0D\x01\x01\x04"
// Example output: "1.2.840.113549.1.1.4"
function parseOID($oid) {
    $values = array();
    $stack = array();
    // Parse first byte
    $values[] = (int) (ord($oid[0]) / 40);
    $values[] = ord($oid[0]) % 40;
    // Parse remaining bytes
    $count = strlen($oid);
    for ($i = 1; $i < $count; $i++) {
        $value = ord($oid[$i]);
        // Collect the low 7 bits of each byte
        $stack[] = $value & 127;
        // High bit clear: last byte of this value
        if ($value < 128) {
            $sum = 0;
            foreach ($stack as $value) {
                $sum <<= 7;
                $sum += $value;
            }
            // Add next value
            $values[] = $sum;
            $stack = array();
        }
    }
    // Validate input
    if ((count($values) < 2) || ($values[0] > 2) || ($values[1] > 39))
        return false;
    // Create dot notation from array
    return implode('.', $values);
}
