I want a regex to validate all types of possible DNs.
I created one, but it's not very good.
/([A-z0-9=]{1}[A-z0-9]{1})*[,??]/
I tried some variations of it as well, but in vain.
Possible DNs can be:
CN=abcd,CN=abcd,O=abcd,C=us
CN=abcd0520,CN=users,O=abcd,C=us
C=us
etc
asked Feb 15, 2012 at 7:19
Muhammad Imran Tariq
- For what purpose? The LDAP server will tell you if the DN is ill-formed. – user207421 Commented Feb 15, 2012 at 8:10
- The user will enter the DN in a form. I want to validate it on the client side. – Muhammad Imran Tariq Commented Feb 15, 2012 at 9:26
- That seems like part of the problem to me. Users shouldn't know an LDAP DN from a hole in the ground. The user should enter something that uniquely identifies the entry concerned, i.e. a uid, and you should then search for that entry. – user207421 Commented Feb 15, 2012 at 9:40
- Come on, the user can be an administrator. – Muhammad Imran Tariq Commented Feb 15, 2012 at 10:06
- Can't it be a base DN? If you know about LDAP connectivity then you should admit that. Also, please just focus on the question, i.e. a regex for a DN. Thanks. – Muhammad Imran Tariq Commented Feb 16, 2012 at 12:18
4 Answers
I recently had a need for this, so I created one that follows the LDAPv3 distinguished name syntax in RFC 2253.
Attribute Type
An attributeType can be expressed in two ways. It can be an alphanumeric string that starts with a letter, validated using:
[A-Za-z][\w-]*
Or it can be an OID, validated using:
\d+(?:\.\d+)*
So attributeType validates using:
[A-Za-z][\w-]*|\d+(?:\.\d+)*
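As a quick sanity check, the attributeType pattern above can be exercised on its own. This is a minimal sketch in Python; the helper name `is_attribute_type` is hypothetical, and `re.ASCII` is an assumption added so that `\w` and `\d` stay limited to ASCII characters.

```python
import re

# Pattern copied from the answer. re.ASCII (an assumption, not in the answer)
# keeps \w and \d from matching non-ASCII letters and digits.
ATTRIBUTE_TYPE = re.compile(r"[A-Za-z][\w-]*|\d+(?:\.\d+)*", re.ASCII)


def is_attribute_type(text: str) -> bool:
    """Return True when the whole string is a descriptor or a dotted OID."""
    return ATTRIBUTE_TYPE.fullmatch(text) is not None


print(is_attribute_type("CN"))       # True: descriptor starting with a letter
print(is_attribute_type("2.5.4.3"))  # True: dotted OID
print(is_attribute_type("2.5.4."))   # False: trailing dot
print(is_attribute_type("1abc"))     # False: neither a descriptor nor an OID
```

One thing to be aware of: `\w` also accepts an underscore, so `a_b` passes this check, although the keychar rule in RFC 2253 lists only letters, digits, and the hyphen.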
Attribute Value
An attributeValue can be expressed in three ways. The first is a hex string, which is a sequence of hex-pairs with a leading #. A hex string validates using:
#(?:[\dA-Fa-f]{2})+
Or an escaped string; each non-special character is expressed "as-is" (validates using [^,=\+<>#;\\"]). Special characters can be expressed with a leading \ (validates using \\[,=\+<>#;\\"]). Finally, any character can be expressed as a hex-pair with a leading \ (validates using \\[\dA-Fa-f]{2}). An escaped string validates using:
(?:[^,=\+<>#;\\"]|\\[,=\+<>#;\\"]|\\[\dA-Fa-f]{2})*
Or a quoted-string; the value starts and ends with ", and can contain any character un-escaped except \ and ". Additionally, any of the methods from the escaped string above can be used. A quoted-string validates using:
"(?:[^\\"]|\\[,=\+<>#;\\"]|\\[\dA-Fa-f]{2})*"
All combined, an attributeValue validates using:
#(?:[\dA-Fa-f]{2})+|(?:[^,=\+<>#;\\"]|\\[,=\+<>#;\\"]|\\[\dA-Fa-f]{2})*|"(?:[^\\"]|\\[,=\+<>#;\\"]|\\[\dA-Fa-f]{2})*"
Name component
A name-component in BNF is:
name-component = attributeTypeAndValue *("+" attributeTypeAndValue)
attributeTypeAndValue = attributeType "=" attributeValue
In regex form, this is:
(?#attributeType)=(?#attributeValue)(?:\+(?#attributeType)=(?#attributeValue))*
Replacing the (?#attributeType) and (?#attributeValue) placeholders with the values above gives us:
(?:[A-Za-z][\w-]*|\d+(?:\.\d+)*)=(?:#(?:[\dA-Fa-f]{2})+|(?:[^,=\+<>#;\\"]|\\[,=\+<>#;\\"]|\\[\dA-Fa-f]{2})*|"(?:[^\\"]|\\[,=\+<>#;\\"]|\\[\dA-Fa-f]{2})*")(?:\+(?:[A-Za-z][\w-]*|\d+(?:\.\d+)*)=(?:#(?:[\dA-Fa-f]{2})+|(?:[^,=\+<>#;\\"]|\\[,=\+<>#;\\"]|\\[\dA-Fa-f]{2})*|"(?:[^\\"]|\\[,=\+<>#;\\"]|\\[\dA-Fa-f]{2})*"))*
This validates a single name-component.
Distinguished name
Finally, the BNF for a distinguished name is:
name-component *("," name-component)
In regex form, this is:
(?#name-component)(?:,(?#name-component))*
Replacing the (?#name-component) placeholder with the value above, and anchoring the result with ^ and $, gives us:
^(?:[A-Za-z][\w-]*|\d+(?:\.\d+)*)=(?:#(?:[\dA-Fa-f]{2})+|(?:[^,=\+<>#;\\"]|\\[,=\+<>#;\\"]|\\[\dA-Fa-f]{2})*|"(?:[^\\"]|\\[,=\+<>#;\\"]|\\[\dA-Fa-f]{2})*")(?:\+(?:[A-Za-z][\w-]*|\d+(?:\.\d+)*)=(?:#(?:[\dA-Fa-f]{2})+|(?:[^,=\+<>#;\\"]|\\[,=\+<>#;\\"]|\\[\dA-Fa-f]{2})*|"(?:[^\\"]|\\[,=\+<>#;\\"]|\\[\dA-Fa-f]{2})*"))*(?:,(?:[A-Za-z][\w-]*|\d+(?:\.\d+)*)=(?:#(?:[\dA-Fa-f]{2})+|(?:[^,=\+<>#;\\"]|\\[,=\+<>#;\\"]|\\[\dA-Fa-f]{2})*|"(?:[^\\"]|\\[,=\+<>#;\\"]|\\[\dA-Fa-f]{2})*")(?:\+(?:[A-Za-z][\w-]*|\d+(?:\.\d+)*)=(?:#(?:[\dA-Fa-f]{2})+|(?:[^,=\+<>#;\\"]|\\[,=\+<>#;\\"]|\\[\dA-Fa-f]{2})*|"(?:[^\\"]|\\[,=\+<>#;\\"]|\\[\dA-Fa-f]{2})*"))*)*$
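Because the final expression is hard to read or maintain by hand, it may be easier to assemble it from the smaller pieces in code. The following is a sketch in Python under a few assumptions: the constant names and `is_valid_dn` are hypothetical, `re.ASCII` is added to keep `\w` and `\d` ASCII-only, and `fullmatch` stands in for the ^ and $ anchors.

```python
import re

# Building blocks copied from the answer above.
ATTRIBUTE_TYPE = r"(?:[A-Za-z][\w-]*|\d+(?:\.\d+)*)"
HEX_STRING = r"#(?:[\dA-Fa-f]{2})+"
ESCAPED = r'(?:[^,=\+<>#;\\"]|\\[,=\+<>#;\\"]|\\[\dA-Fa-f]{2})*'
QUOTED = r'"(?:[^\\"]|\\[,=\+<>#;\\"]|\\[\dA-Fa-f]{2})*"'
ATTRIBUTE_VALUE = f"(?:{HEX_STRING}|{ESCAPED}|{QUOTED})"

# name-component = attributeTypeAndValue *("+" attributeTypeAndValue)
NAME_COMPONENT = (
    rf"{ATTRIBUTE_TYPE}={ATTRIBUTE_VALUE}"
    rf"(?:\+{ATTRIBUTE_TYPE}={ATTRIBUTE_VALUE})*"
)

# name-component *("," name-component); fullmatch replaces ^...$.
DN = re.compile(rf"{NAME_COMPONENT}(?:,{NAME_COMPONENT})*", re.ASCII)


def is_valid_dn(text: str) -> bool:
    """Return True when the whole string matches the assembled DN pattern."""
    return DN.fullmatch(text) is not None


for candidate in (
    "CN=abcd,CN=abcd,O=abcd,C=us",       # from the question
    "CN=abcd0520,CN=users,O=abcd,C=us",  # from the question
    "C=us",                              # from the question
    'CN="Smith, John",O=abcd',           # quoted value containing a comma
    "OU=Sales+CN=J. Smith,O=abcd",       # multi-valued name-component
    "CN=abcd,,C=us",                     # empty component, rejected
    "=abcd",                             # missing attribute type, rejected
):
    print(candidate, is_valid_dn(candidate))
```

Note that the escaped-string alternative can match an empty value, so an input such as `CN=` is accepted by this pattern.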
This is not only not possible, it will never work, and should not even be attempted. LDAP data (a distinguished name in this case) are not strings. A distinguished name has distinguishedName syntax, which is not a string, and comparisons must be made using matching rules defined in the directory server schema. For this reason, regular expressions and native-language comparison, relative value, and equality operations like Perl's ~~, eq and == and Java's == cannot be used with LDAP data. If a programmer attempts this, unexpected results can occur, and the code is brittle, unpredictable, and not repeatable. Language LDAP APIs that do not support matching rules cannot be used with LDAP where comparison, equality checks, and relative value ordering comparisons are required.
By way of example, the distinguished names "dc=example,dc=com" and "DC=example, DC=COM" are equivalent in every way from an LDAP perspective, but native language equality operators would return false.
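The mismatch is easy to reproduce. The sketch below is in Python and uses a hypothetical helper, `naive_normalize`, that lower-cases and trims each component. It is a deliberate simplification for illustration only: it assumes case-insensitive attributes and ignores escapes, quoting, multi-valued RDNs, and the schema's actual matching rules, so it is not a substitute for comparison done by the directory server.

```python
def naive_normalize(dn: str) -> tuple:
    """Hypothetical helper: lower-case and trim each type/value pair.

    Assumes every attribute is case-insensitive and that the DN contains
    no escaped or quoted commas. Real matching rules come from the schema.
    """
    parts = []
    for rdn in dn.split(","):
        attr, _, value = rdn.partition("=")
        parts.append((attr.strip().lower(), value.strip().lower()))
    return tuple(parts)


a = "dc=example,dc=com"
b = "DC=example, DC=COM"

print(a == b)                                    # False: plain string equality
print(naive_normalize(a) == naive_normalize(b))  # True under the assumptions above
```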
This worked for me:
Expression:
^(?<RDN>(?<Key>(?:\\[0-9A-Fa-f]{2}|\\\[^=\,\\]|[^=\,\\]+)+)\=(?<Value>(?:\\[0-9A-Fa-f]{2}|\\\[^=\,\\]|[^=\,\\]+)+))(?:\s*\,\s*(?<RDN>(?<Key>(?:\\[0-9A-Fa-f]{2}|\\\[^=\,\\]|[^=\,\\]+)+)\=(?<Value>(?:\\[0-9A-Fa-f]{2}|\\\[^=\,\\]|[^=\,\\]+)+)))*$
Test:
CN=Test User Delete\0ADEL:c1104f63-0389-4d25-8e03-822a5c3616bc,CN=Deleted Objects,DC=test,DC=domain,DC=local
The expression is already regex-escaped, so to avoid having to double all the backslashes in C#, make sure you write it as a verbatim string literal by prefixing the string with the @ sign, i.e.
var dnExpression = @"...";
This will yield four groups: first a copy of the whole string, second a copy of the last RDN, and third and fourth the key/value pairs. You can index into each key/value using the Captures collection of each group.
You can also use this to validate an RDN by cutting the expression down to the "(?<RDN>...)" group, surrounded by the usual "^...$" to require a whole value (start to end of string).
I've allowed a hex special character escape "\", simple character escape "\" or anything other than ",=\" inside the key/value DN text. I'd guess this expression could be perfected by taking extra time to go through the MSDN AD standard and restricting the allowed characters to match exactly what is or is not allowed. But I believe this is a good start.
I created one. It is working well for me.
^(\w+=\w+)(,\w+=\w+)*$
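To see where a pattern of this shape stops working, here is a small Python check. It is a sketch only: it uses a plain `,` as the separator and `=` without quantifiers, and the helper name `matches_simple` is hypothetical.

```python
import re

# word characters, "=", word characters, repeated with "," in between
SIMPLE_DN = re.compile(r"^(\w+=\w+)(,\w+=\w+)*$")


def matches_simple(text: str) -> bool:
    """Return True when the string matches the simple key=value list pattern."""
    return SIMPLE_DN.match(text) is not None


print(matches_simple("CN=abcd,CN=abcd,O=abcd,C=us"))       # True
print(matches_simple("CN=abcd0520,CN=users,O=abcd,C=us"))  # True
print(matches_simple("C=us"))                              # True
print(matches_simple("CN=John Smith,O=abcd"))              # False: space in value
print(matches_simple("CN=abcd-1,C=us"))                    # False: hyphen in value
print(matches_simple("2.5.4.3=abcd"))                      # False: OID attribute type
```

The three examples from the question pass, but any value containing a space, hyphen, or escaped character is rejected, as is an attribute type written as an OID.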