Using RegEx (Regular Expression Extractor) with JMeter

USING REGULAR EXPRESSION EXTRACTOR

JMeter can work with regular expressions. This feature is useful for extracting information from a response body. For example, you request a page and then need to get a link from the page that was downloaded. In this article I’m going to share some syntax details of using regular expressions in JMeter.

I created a very simple test plan; see Figure 1.

New test plan
Figure 1


You may notice one unfamiliar element in the picture: the Regular Expression Extractor post-processor. Let's look at it more closely in Figure 2.

Regular Expression Extractor
Figure 2


In general, JMeter regular expressions use the same syntax as Perl5. But there is one very important difference between JMeter and Perl regexp processing. In Perl you wrap the regexp in “//” delimiters, for example =~ /regular_expression/. You cannot use “//” for the same purpose in JMeter, because the slashes will be parsed as literal characters. Type the expression into the field without delimiters, and if you use grouping, use “()” parentheses to separate one group from another.
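Here is a minimal sketch of the delimiter point. It uses Python's `re` module as a stand-in (it is not JMeter, and its syntax differs from Perl5 in some details), and the response fragment is an assumption made up for illustration.

```python
import re

# Hypothetical fragment of a response body, for illustration only.
body = '<a href="http://example.com/economics">Economics</a>'

# Bare pattern, the way it would be typed into the Regular Expression field.
bare = re.search(r"economics", body)

# Perl-style delimiters are not stripped: the slashes are literal characters,
# so this searches for the text "/economics/" and finds nothing in this body.
delimited = re.search(r"/economics/", body)

print(bare.group(0))  # economics
print(delimited)      # None
```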

I will briefly describe all fields of this element.

“Apply to” option. Useful when a sample has child samples that request embedded resources. This parameter defines whether the regular expression is applied only to the main sample results or to the embedded resources too.

Main sample only - only applies to the main sample.

Sub-samples only - only applies to the sub-samples.

Main sample and sub-samples - applies to both main sample and sub-samples.

JMeter Variable - the extractor is applied to the contents of the named variable, which can be filled by another request.

“Response field to check” option. This parameter defines which field the regular expression is applied to.

The possible options are:

Body - the body of the response. In other words, the content of a web page, excluding headers, will be parsed with the regular expression.

Body (unescaped) - the body of the response, with all HTML escape codes replaced. Note that HTML escapes are processed without regard to context, so some incorrect substitutions may be made.

Headers - may not be present for non-HTTP samples

URL - the URL of the request will be parsed with the regular expression.

Response Code - e.g. 200

Response Message - e.g. OK
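To show why the choice between “Body” and “Body (unescaped)” matters, here is a rough sketch in Python. The escaped fragment is invented for illustration, and `html.unescape` only stands in for JMeter's own unescaping, which may behave differently in edge cases.

```python
import html
import re

# Hypothetical response body in which the markup arrives HTML-escaped.
raw_body = "&lt;a href=&quot;/economics&quot;&gt;"

# A pattern written against the unescaped markup.
pattern = r'<a href="/([a-z]+)">'

# Against the raw body the pattern finds nothing.
raw_match = re.search(pattern, raw_body)

# After the escape codes are replaced, the same pattern matches.
unescaped_body = html.unescape(raw_body)
unescaped_match = re.search(pattern, unescaped_body)

print(raw_match)                 # None
print(unescaped_match.group(1))  # economics
```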

Reference Name. This parameter contains the name of the variable where the parsing results will be saved.

Regular expression. Field for regular expression itself.

Template. The template used to create a string from the matches found. This is an arbitrary string with special elements to refer to groups within the regular expression. The syntax to refer to a group is: '$1$' to refer to group 1, '$2$' to refer to group 2, etc. $0$ refers to whatever the entire expression matches. So, if the response contains the word “economics”, you search for the regular expression “(ec)(onomics)” and apply the template $2$$1$, then the output variable will contain “onomicsec”.
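The template substitution can be imitated in a few lines of Python. This is only a sketch of the idea, not JMeter's code; `apply_template` is a hypothetical helper written for this example.

```python
import re

def apply_template(match, template):
    """Replace every $n$ reference in the template with group n of the match."""
    return re.sub(
        r"\$(\d+)\$",
        lambda ref: match.group(int(ref.group(1))) or "",
        template,
    )

match = re.search(r"(ec)(onomics)", "economics")

swapped = apply_template(match, "$2$$1$")  # groups in reverse order
whole = apply_template(match, "$0$")       # the entire match

print(swapped)  # onomicsec
print(whole)    # economics
```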

Match No. If there are several matching character sequences, this field specifies which one should be used. Important note: if you set “Apply to” to “Main sample and sub-samples” and specify “Match No.” = 3, then JMeter will select the matching sequence from the 2nd sub-sample, because the 1st match comes from the main sample. If zero is specified, JMeter will choose a match at random.

If the match number is set to a negative number, e.g. “-2”, then all the possible matches in the sampler data are processed. The variables are set as follows:

refName_matchNr - the number of matches found; could be 0

refName_n, where n = 1,2,3 etc - the strings as generated by the template

refName_n_gm, where m=0,1,2 - the groups for match n

refName - always set to the default value

refName_gn - not set
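The variable layout for a negative match number can be sketched as follows. This is an emulation in Python under my own assumptions, not JMeter's implementation: `fill_template` and `extract_all` are hypothetical helpers, and the sample body is invented.

```python
import re

def fill_template(match, template):
    """Replace every $n$ reference in the template with group n of the match."""
    return re.sub(
        r"\$(\d+)\$",
        lambda ref: match.group(int(ref.group(1))) or "",
        template,
    )

def extract_all(body, pattern, template, ref_name, default=""):
    """Build the variables described above for a negative match number."""
    variables = {ref_name: default}  # refName is always set to the default value
    matches = list(re.finditer(pattern, body))
    variables[f"{ref_name}_matchNr"] = str(len(matches))
    for n, match in enumerate(matches, start=1):
        # refName_n: the string generated by the template for match n
        variables[f"{ref_name}_{n}"] = fill_template(match, template)
        # refName_n_gm: group m of match n (group 0 is the whole match)
        for m in range(match.re.groups + 1):
            variables[f"{ref_name}_{n}_g{m}"] = match.group(m) or ""
    return variables

variables = extract_all("id=10 id=20", r"id=(\d+)", "$1$", "link")
for name, value in variables.items():
    print(name, "=", repr(value))
```

With two matches in the body, `link_matchNr` is "2", `link_1` and `link_2` hold the templated strings, and `link` itself keeps the default.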

In short, this field indicates which match to use, because the regular expression may match multiple times.

Use a value of zero to indicate that JMeter should choose a match at random.

A positive number N means to select the Nth match.
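The zero and positive cases can be pictured like this. It is a simplified sketch, not JMeter's code: `pick_match` and the "NOT_FOUND" default are assumptions for illustration, and falling back to the default when there are fewer than N matches is my reading of how the default value is meant to work.

```python
import random
import re

def pick_match(body, pattern, match_no, default="NOT_FOUND"):
    """Pick one match: 0 means random, a positive N means the Nth match."""
    found = [m.group(0) for m in re.finditer(pattern, body)]
    if match_no == 0:
        return random.choice(found) if found else default
    if 1 <= match_no <= len(found):
        return found[match_no - 1]
    # Not enough matches (or a negative number, handled separately above).
    return default

body = "a1 b2 c3"
second = pick_match(body, r"[a-z][0-9]", 2)
anyone = pick_match(body, r"[a-z][0-9]", 0)
missing = pick_match(body, r"[a-z][0-9]", 5)

print(second)   # b2
print(missing)  # NOT_FOUND
```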

That’s all about the options of the Regular Expression Extractor. Now I will show a couple of practical examples. In all examples I will use the same URL for extracting a string by regexp; see Figure 3.

HTTP Request Defaults
Figure 3


After extraction, the string is put into the variable ${pageLink} and used in the “pageLink” HTTP Request, as shown in Figure 4.

HTTP Request
Figure 4


Searching by word. If you need to extract a string with a regular expression that is a single word, then fill in the Regular Expression Extractor as in Figure 5.

Regular expression
Figure 5


After executing the “tut.by” request and applying the regexp, we get ${pageLink} = economics, which is then used in the “pageLink” request; see Figure 6.

View results tree
Figure 6


Using groups. You can rearrange parts of a match using groups. For example, you need to find the word “economics”, but before putting it into ${pageLink} you want to rearrange parts of the word. See Figure 7 for the syntax.

Using groups
Figure 7


And here is what we get in the View Results Tree:

View results tree
Figure 8


Using classes in regexps. Regular expressions can use classes of characters. For example, [0-9] means “any numeric character”. If I set the regexp as in Figure 9, then I will receive the 3rd matching result from the response body.

Classes in regular expressions
Figure 9


“{5,6}” means that the result should contain no fewer than 5 and no more than 6 characters. The outcome in the View Results Tree is shown in Figure 10.

View results tree
Figure 10
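As a rough illustration of a character class with a “{5,6}” quantifier and a match number of 3, here is a Python sketch. The body text is invented, and the pattern is my assumption of what such an expression could look like, since the figure is not reproduced here.

```python
import re

# Hypothetical response text containing digit runs of various lengths.
body = "order 12 ref 123456 zip 54321 code 987654 id 4242"

# [0-9]{5,6}: between 5 and 6 consecutive digits.
matches = re.findall(r"[0-9]{5,6}", body)

# A match number of 3 corresponds to the third element (index 2).
third = matches[2]

print(matches)  # ['123456', '54321', '987654']
print(third)    # 987654
```

The runs “12” and “4242” are skipped because they are shorter than 5 digits.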


Using “^”. Placed first inside square brackets, “^” means inversion, e.g. the regular expression [^0-9] will look for non-numeric characters. So, I’ll set the regexp as in Figure 11.

Using ^
Figure 11


And in the View Results Tree I get a very interesting situation; see Figure 12.

Figure 12


What happened? Look at Figure 13

Figure 13


We caught a “carriage return” character, and this is the reason for the java.net.MalformedURLException. To repair the regexp I’ll add “<” before it and restart the test. Now it’s OK.

Figure 14
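The effect can be reproduced in a small Python sketch. Both the body and the two patterns are assumptions of mine (the figures are not reproduced here); the point is only that a negated class such as [^0-9] also accepts line-break characters, and that a literal “<” in front moves the match onto the markup.

```python
import re

# Hypothetical body: a digit run, a line break, then a tag.
body = "12\r\n<title>34"

# The negated class happily starts on the carriage return.
loose = re.search(r"[^0-9]{5,6}", body).group(0)

# Requiring a literal "<" first skips the line break.
anchored = re.search(r"<[^0-9]{5,6}", body).group(0)

print(repr(loose))     # '\r\n<tit'
print(repr(anchored))  # '<title>'
```

A value that begins with “\r\n” is not usable as part of a URL, which is consistent with the exception seen above.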


Of course, I cannot cover all possible cases of using regular expressions in one article. For more information, refer to the JMeter Regular Expressions Tutorial, which covers the topic in detail.

JMeter uses Jakarta ORO for regular expression processing. You can quickly test your regular expressions with the Jakarta ORO Demonstration Applet, which is the fastest way of seeing the resulting matches, groups, etc.



