<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Regex on DonDoIT</title><link>/tags/regex/</link><description>Recent content in Regex on DonDoIT</description><generator>Hugo -- gohugo.io</generator><language>en-us</language><lastBuildDate>Thu, 28 Nov 2024 19:16:14 +0200</lastBuildDate><atom:link href="/tags/regex/index.xml" rel="self" type="application/rss+xml"/><item><title>Regex in Python (part 3)</title><link>/posts/python/regex3/</link><pubDate>Thu, 28 Nov 2024 19:16:14 +0200</pubDate><guid>/posts/python/regex3/</guid><description>&lt;h1 id="find-expression-containing-numbers-and-symbols-in-a-specific-format"&gt;Find expression containing numbers and symbols in a specific format&lt;/h1&gt;
&lt;p&gt;Assume we have this piece of text, which contains an IPv4 address that we want to extract.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#f92672"&gt;import&lt;/span&gt; re
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;text &lt;span style="color:#f92672"&gt;=&lt;/span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;You&amp;#39;ve recently logged in from an IP address 111.222.211.122&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The full range of IP addresses can go from 0.0.0.0 to 255.255.255.255, so we can use the following regex pattern to search&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;pattern &lt;span style="color:#f92672"&gt;=&lt;/span&gt; &lt;span style="color:#e6db74"&gt;r&lt;/span&gt;&lt;span style="color:#e6db74"&gt;&amp;#34;\d\d\d.\d\d\d.\d\d\d.\d\d\d&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The result of this will be&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#f92672"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; print(re&lt;span style="color:#f92672"&gt;.&lt;/span&gt;findall(pattern, text))
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;[&lt;span style="color:#e6db74"&gt;&amp;#39;111.222.211.122&amp;#39;&lt;/span&gt;]
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;However, if the text now has something extra like this:&lt;/p&gt;</description><content>&lt;h1 id="find-expression-containing-numbers-and-symbols-in-a-specific-format"&gt;Find expression containing numbers and symbols in a specific format&lt;/h1&gt;
&lt;p&gt;Assume we have this piece of text, which contains an IPv4 address that we want to extract.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#f92672"&gt;import&lt;/span&gt; re
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;text &lt;span style="color:#f92672"&gt;=&lt;/span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;You&amp;#39;ve recently logged in from an IP address 111.222.211.122&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The full range of IP addresses can go from 0.0.0.0 to 255.255.255.255, so we can use the following regex pattern to search&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;pattern &lt;span style="color:#f92672"&gt;=&lt;/span&gt; &lt;span style="color:#e6db74"&gt;r&lt;/span&gt;&lt;span style="color:#e6db74"&gt;&amp;#34;\d\d\d.\d\d\d.\d\d\d.\d\d\d&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The result of this will be&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#f92672"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; print(re&lt;span style="color:#f92672"&gt;.&lt;/span&gt;findall(pattern, text))
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;[&lt;span style="color:#e6db74"&gt;&amp;#39;111.222.211.122&amp;#39;&lt;/span&gt;]
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;However, if the text now has something extra like this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;text &lt;span style="color:#f92672"&gt;=&lt;/span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;You&amp;#39;ve recently logged in from an IP address 111.222.211.122, and something weird like this 123123123123122&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Now the search result will be &lt;code&gt;['111.222.211.122', '123123123123122']&lt;/code&gt;. This is because:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;\d\d\d&lt;/code&gt; matches exactly three consecutive digits&lt;/li&gt;
&lt;li&gt;&lt;code&gt;.&lt;/code&gt; matches any character except a newline, not just a literal dot, so &lt;code&gt;1231&lt;/code&gt; matches the pattern &lt;code&gt;\d\d\d.&lt;/code&gt;. The same goes for &lt;code&gt;123!&lt;/code&gt; or &lt;code&gt;123@&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To match a literal dot &lt;code&gt;.&lt;/code&gt;, put a backslash &lt;code&gt;\&lt;/code&gt; in front of it, like this: &lt;code&gt;\d\d\d\.&lt;/code&gt;&lt;/p&gt;
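&lt;p&gt;As a minimal sketch of the fix, here is the same search with every dot escaped. The second pattern, which uses &lt;code&gt;\d{1,3}&lt;/code&gt; to allow one to three digits per group, is an assumption added for illustration and is not part of the original example; the names &lt;code&gt;strict_pattern&lt;/code&gt;, &lt;code&gt;flexible_pattern&lt;/code&gt; and &lt;code&gt;sample&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
import re

text = (
    "You've recently logged in from an IP address 111.222.211.122, "
    "and something weird like this 123123123123122"
)

# Raw string, with every dot escaped so that it only matches a literal "."
strict_pattern = r"\d\d\d\.\d\d\d\.\d\d\d\.\d\d\d"
strict_matches = re.findall(strict_pattern, text)
print(strict_matches)  # ['111.222.211.122']

# Assumption for illustration: allow 1 to 3 digits in each group, so that
# shorter addresses such as 10.0.0.1 are found as well
flexible_pattern = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"
sample = "10.0.0.1 and 111.222.211.122"
flexible_matches = re.findall(flexible_pattern, sample)
print(flexible_matches)  # ['10.0.0.1', '111.222.211.122']
```

&lt;p&gt;Neither pattern checks that each group stays within 0 to 255; they only describe the shape of an address.&lt;/p&gt;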
&lt;p&gt;&lt;strong&gt;Extra tips:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The pattern &lt;code&gt;\d+&lt;/code&gt; matches a run of one or more digits, i.e. a number of any length.&lt;/li&gt;
&lt;/ul&gt;</content></item><item><title>Regex in Python (part 2)</title><link>/posts/python/regex2/</link><pubDate>Thu, 21 Nov 2024 18:47:19 +0200</pubDate><guid>/posts/python/regex2/</guid><description>&lt;h1 id="find-words-of-specifc-length-starting-with-specific-letter"&gt;Find words of specific length starting with specific letter&lt;/h1&gt;
&lt;p&gt;Assume we want to find all the 2-character words that start with &lt;code&gt;i&lt;/code&gt; and end with one of, e.g., &lt;code&gt;s, t, o, n, l&lt;/code&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#f92672"&gt;import&lt;/span&gt; re
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;text &lt;span style="color:#f92672"&gt;=&lt;/span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;I live in Finland and the cold is killing me&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;pattern &lt;span style="color:#f92672"&gt;=&lt;/span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;i[stonl]&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;matches &lt;span style="color:#f92672"&gt;=&lt;/span&gt; re&lt;span style="color:#f92672"&gt;.&lt;/span&gt;findall(pattern, text)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;When printing the result &lt;code&gt;matches&lt;/code&gt;, you&amp;rsquo;ll get &lt;code&gt;['in', 'in', 'is', 'il', 'in']&lt;/code&gt;, which come from&lt;/p&gt;
&lt;pre tabindex="0"&gt;&lt;code&gt;I live [in] F[in]land and the cold [is] k[il]l[in]g me
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This pattern finds not only whole words, but also any substring of a word that matches the pattern.&lt;/p&gt;</description><content>&lt;h1 id="find-words-of-specifc-length-starting-with-specific-letter"&gt;Find words of specific length starting with specific letter&lt;/h1&gt;
&lt;p&gt;Assume we want to find all the 2-character words that start with &lt;code&gt;i&lt;/code&gt; and end with one of, e.g., &lt;code&gt;s, t, o, n, l&lt;/code&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#f92672"&gt;import&lt;/span&gt; re
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;text &lt;span style="color:#f92672"&gt;=&lt;/span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;I live in Finland and the cold is killing me&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;pattern &lt;span style="color:#f92672"&gt;=&lt;/span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;i[stonl]&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;matches &lt;span style="color:#f92672"&gt;=&lt;/span&gt; re&lt;span style="color:#f92672"&gt;.&lt;/span&gt;findall(pattern, text)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;When printing the result &lt;code&gt;matches&lt;/code&gt;, you&amp;rsquo;ll get &lt;code&gt;['in', 'in', 'is', 'il', 'in']&lt;/code&gt;, which come from&lt;/p&gt;
&lt;pre tabindex="0"&gt;&lt;code&gt;I live [in] F[in]land and the cold [is] k[il]l[in]g me
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This pattern finds not only whole words, but also any substring of a word that matches the pattern.&lt;/p&gt;
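&lt;p&gt;One way to see where each match sits is to wrap it in square brackets with &lt;code&gt;re.sub&lt;/code&gt;. This is only a sketch for visualising the matches; the variable name &lt;code&gt;marked&lt;/code&gt; is hypothetical.&lt;/p&gt;

```python
import re

text = "I live in Finland and the cold is killing me"
pattern = r"i[stonl]"

# Replace every match with the same text wrapped in square brackets
marked = re.sub(pattern, lambda m: f"[{m.group()}]", text)
print(marked)  # I live [in] F[in]land and the cold [is] k[il]l[in]g me
```

&lt;p&gt;The capital &lt;code&gt;I&lt;/code&gt; at the start is left alone because matching is case-sensitive by default.&lt;/p&gt;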
&lt;h2 id="with-"&gt;With &lt;code&gt;^&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;Let&amp;rsquo;s modify the pattern a bit by adding &lt;code&gt;^&lt;/code&gt; in front, so that it looks like this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;pattern &lt;span style="color:#f92672"&gt;=&lt;/span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;^i[stonl]&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Now &lt;code&gt;matches&lt;/code&gt; will be &lt;code&gt;[]&lt;/code&gt;. This is because &lt;code&gt;^&lt;/code&gt; anchors the pattern to the beginning of the text. In other words, a lowercase &lt;code&gt;i&lt;/code&gt; must be the first character of the text, and our text starts with an uppercase &lt;code&gt;I&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;How about &lt;code&gt;&amp;quot;^i[stonl][nm]&amp;quot;&lt;/code&gt;? This pattern matches three characters at the very beginning of the text: &lt;code&gt;i&lt;/code&gt; as the first character, followed by one of the characters &lt;code&gt;s, t, o, n, l&lt;/code&gt;, and then either &lt;code&gt;n&lt;/code&gt; or &lt;code&gt;m&lt;/code&gt;.&lt;/p&gt;
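&lt;p&gt;A small sketch of how &lt;code&gt;^&lt;/code&gt; behaves. The two extra sample strings are made up for illustration and are not from the original example.&lt;/p&gt;

```python
import re

pattern = r"^i[stonl]"

# Starts with an uppercase "I", so the case-sensitive pattern finds nothing
no_match = re.findall(pattern, "I live in Finland and the cold is killing me")
print(no_match)  # []

# Starts with a lowercase "is", so there is exactly one match at position 0
one_match = re.findall(pattern, "is it cold in Finland?")
print(one_match)  # ['is']

# Three-character variant: "i", then one of s/t/o/n/l, then "n" or "m"
three_chars = re.findall(r"^i[stonl][nm]", "inn by the lake")
print(three_chars)  # ['inn']
```

&lt;p&gt;Note that the later &lt;code&gt;in&lt;/code&gt; in the second sample is not returned, because only a match at the very start of the text counts.&lt;/p&gt;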
&lt;p&gt;&lt;strong&gt;Extra tips:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;We can use &lt;code&gt;$&lt;/code&gt; to anchor the pattern to the end of the text (or to the end of each line when the &lt;code&gt;re.MULTILINE&lt;/code&gt; flag is set). For example, &lt;code&gt;r&amp;quot;me$&amp;quot;&lt;/code&gt; matches &lt;code&gt;me&lt;/code&gt; only when it is the last thing in the text.&lt;/li&gt;
&lt;li&gt;If we want to specifically search for an independent word, use &lt;code&gt;\b&lt;/code&gt;. For example:
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#f92672"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; text &lt;span style="color:#f92672"&gt;=&lt;/span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;This is example0 and example1&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#f92672"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; pattern &lt;span style="color:#f92672"&gt;=&lt;/span&gt; &lt;span style="color:#e6db74"&gt;r&lt;/span&gt;&lt;span style="color:#e6db74"&gt;&amp;#34;\bexample[01]?\b&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#f92672"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; print(re&lt;span style="color:#f92672"&gt;.&lt;/span&gt;findall(pattern, text))
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;[&lt;span style="color:#e6db74"&gt;&amp;#39;example0&amp;#39;&lt;/span&gt;, &lt;span style="color:#e6db74"&gt;&amp;#39;example1&amp;#39;&lt;/span&gt;]
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/li&gt;
&lt;/ul&gt;
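&lt;p&gt;To illustrate the &lt;code&gt;$&lt;/code&gt; tip, here is a rough sketch with a two-line text. The sample text and the pattern &lt;code&gt;\w*me$&lt;/code&gt; are assumptions chosen for illustration, not taken from the post.&lt;/p&gt;

```python
import re

text = "the cold is killing me\nplease send some"
pattern = r"\w*me$"

# By default, "$" only matches at the end of the whole string
# (or just before a trailing newline)
default_matches = re.findall(pattern, text)
print(default_matches)  # ['some']

# With re.MULTILINE, "$" also matches at the end of every line
multiline_matches = re.findall(pattern, text, flags=re.MULTILINE)
print(multiline_matches)  # ['me', 'some']
```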
&lt;p&gt;Hope this is helpful &amp;#x1f60a;.&lt;/p&gt;</content></item><item><title>Regex in Python (part 1)</title><link>/posts/python/regex/</link><pubDate>Thu, 21 Nov 2024 00:05:47 +0200</pubDate><guid>/posts/python/regex/</guid><description>&lt;h1 id="search-for-string-in-text"&gt;Search for string in text&lt;/h1&gt;
&lt;p&gt;Assume we have the text &lt;code&gt;the quick brown fox jumped over the lazy dog&lt;/code&gt; and want to search for a word in it, e.g. &lt;code&gt;quick&lt;/code&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#f92672"&gt;import&lt;/span&gt; re
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;text &lt;span style="color:#f92672"&gt;=&lt;/span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;the quick brown fox jumped over the lazy dog&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#66d9ef"&gt;match&lt;/span&gt; &lt;span style="color:#f92672"&gt;=&lt;/span&gt; re&lt;span style="color:#f92672"&gt;.&lt;/span&gt;search(&lt;span style="color:#e6db74"&gt;&amp;#34;quick&amp;#34;&lt;/span&gt;, text)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;According to the &lt;code&gt;.search()&lt;/code&gt; &lt;a href="https://docs.python.org/3/library/re.html#re.Pattern.search"&gt;documentation&lt;/a&gt;, it scans the text for the first location where the pattern matches and returns a &lt;a href="https://docs.python.org/3/library/re.html#re.Match"&gt;&lt;code&gt;re.Match&lt;/code&gt;&lt;/a&gt; object if one is found; otherwise it returns &lt;code&gt;None&lt;/code&gt;.&lt;/p&gt;</description><content>&lt;h1 id="search-for-string-in-text"&gt;Search for string in text&lt;/h1&gt;
&lt;p&gt;Assume we have the text &lt;code&gt;the quick brown fox jumped over the lazy dog&lt;/code&gt; and want to search for a word in it, e.g. &lt;code&gt;quick&lt;/code&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#f92672"&gt;import&lt;/span&gt; re
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;text &lt;span style="color:#f92672"&gt;=&lt;/span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;the quick brown fox jumped over the lazy dog&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#66d9ef"&gt;match&lt;/span&gt; &lt;span style="color:#f92672"&gt;=&lt;/span&gt; re&lt;span style="color:#f92672"&gt;.&lt;/span&gt;search(&lt;span style="color:#e6db74"&gt;&amp;#34;quick&amp;#34;&lt;/span&gt;, text)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;According to the &lt;code&gt;.search()&lt;/code&gt; &lt;a href="https://docs.python.org/3/library/re.html#re.Pattern.search"&gt;documentation&lt;/a&gt;, it scans the text for the first location where the pattern matches and returns a &lt;a href="https://docs.python.org/3/library/re.html#re.Match"&gt;&lt;code&gt;re.Match&lt;/code&gt;&lt;/a&gt; object if one is found; otherwise it returns &lt;code&gt;None&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;If we &lt;code&gt;print(match)&lt;/code&gt;, we&amp;rsquo;ll see &lt;code&gt;&amp;lt;re.Match object; span=(4, 9), match='quick'&amp;gt;&lt;/code&gt;, which indicates that the matched string starts at index &lt;code&gt;4&lt;/code&gt; and ends just before index &lt;code&gt;9&lt;/code&gt; (the end index is exclusive).&lt;/p&gt;
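&lt;p&gt;The span can be used directly to slice the original text, and since &lt;code&gt;re.search&lt;/code&gt; may return &lt;code&gt;None&lt;/code&gt;, it is worth checking the result before using it. A minimal sketch; the second search for &lt;code&gt;cat&lt;/code&gt; and the name &lt;code&gt;missing&lt;/code&gt; are added here for illustration.&lt;/p&gt;

```python
import re

text = "the quick brown fox jumped over the lazy dog"

match = re.search(r"quick", text)
if match is not None:
    start, end = match.span()  # start is inclusive, end is exclusive
    print(start, end)          # 4 9
    print(text[start:end])     # quick
    print(match.group())       # quick

# No "cat" in the text, so re.search returns None instead of a Match object
missing = re.search(r"cat", text)
print(missing)  # None
```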
&lt;p&gt;To get the matched value that the &lt;code&gt;re.Match&lt;/code&gt; object is holding, we can simply use&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#66d9ef"&gt;match&lt;/span&gt;&lt;span style="color:#f92672"&gt;.&lt;/span&gt;group()
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h1 id="find-characters-by-type"&gt;Find characters by type&lt;/h1&gt;
&lt;p&gt;Now assume we&amp;rsquo;re working with a slightly different piece of text from the example above:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#f92672"&gt;import&lt;/span&gt; re
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;text &lt;span style="color:#f92672"&gt;=&lt;/span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;the quick brown fox jumped over the lazy dog 1234567890 !@#$%^&amp;amp;*()_&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="find-alphanumeric-characters"&gt;Find alphanumeric characters&lt;/h2&gt;
&lt;p&gt;To find all the word characters, we can use the regex &lt;a href="https://learn.microsoft.com/en-us/dotnet/standard/base-types/character-classes-in-regular-expressions#word-character-w"&gt;&lt;code&gt;\w&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;characters &lt;span style="color:#f92672"&gt;=&lt;/span&gt; re&lt;span style="color:#f92672"&gt;.&lt;/span&gt;findall(&lt;span style="color:#e6db74"&gt;r&lt;/span&gt;&lt;span style="color:#e6db74"&gt;&amp;#34;\w&amp;#34;&lt;/span&gt;, text)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;When printing &lt;code&gt;characters&lt;/code&gt;, we&amp;rsquo;ll get the letters and digits of the text split into a list. The symbols &lt;code&gt;!@#$%^&amp;amp;*()&lt;/code&gt; and the spaces won&amp;rsquo;t be returned, as they are not considered word characters. The &lt;strong&gt;exception&lt;/strong&gt; is &lt;code&gt;_&lt;/code&gt;, which does count as a word character.&lt;/p&gt;
&lt;pre tabindex="0"&gt;&lt;code&gt;[&amp;#39;t&amp;#39;,&amp;#39;h&amp;#39;,&amp;#39;e&amp;#39;,&amp;#39;q&amp;#39;,&amp;#39;u&amp;#39;,&amp;#39;i&amp;#39;,&amp;#39;c&amp;#39;,&amp;#39;k&amp;#39;,&amp;#39;b&amp;#39;,&amp;#39;r&amp;#39;,&amp;#39;o&amp;#39;,&amp;#39;w&amp;#39;,&amp;#39;n&amp;#39;,&amp;#39;f&amp;#39;,&amp;#39;o&amp;#39;,&amp;#39;x&amp;#39;,&amp;#39;j&amp;#39;,&amp;#39;u&amp;#39;,&amp;#39;m&amp;#39;,&amp;#39;p&amp;#39;,&amp;#39;e&amp;#39;,&amp;#39;d&amp;#39;,&amp;#39;o&amp;#39;,&amp;#39;v&amp;#39;,&amp;#39;e&amp;#39;,&amp;#39;r&amp;#39;,&amp;#39;t&amp;#39;,&amp;#39;h&amp;#39;,&amp;#39;e&amp;#39;,&amp;#39;l&amp;#39;,&amp;#39;a&amp;#39;,&amp;#39;z&amp;#39;,&amp;#39;y&amp;#39;,&amp;#39;d&amp;#39;,&amp;#39;o&amp;#39;,&amp;#39;g&amp;#39;,&amp;#39;1&amp;#39;,&amp;#39;2&amp;#39;,&amp;#39;3&amp;#39;,&amp;#39;4&amp;#39;,&amp;#39;5&amp;#39;,&amp;#39;6&amp;#39;,&amp;#39;7&amp;#39;,&amp;#39;8&amp;#39;,&amp;#39;9&amp;#39;,&amp;#39;0&amp;#39;,&amp;#39;_&amp;#39;]
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id="find-any-characters"&gt;Find any characters&lt;/h2&gt;
&lt;p&gt;To find any character (except a newline), whether it&amp;rsquo;s a word character or not, use &lt;a href="https://learn.microsoft.com/en-us/dotnet/standard/base-types/character-classes-in-regular-expressions#any-character-"&gt;&lt;code&gt;.&lt;/code&gt;&lt;/a&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;any_characters &lt;span style="color:#f92672"&gt;=&lt;/span&gt; re&lt;span style="color:#f92672"&gt;.&lt;/span&gt;findall(&lt;span style="color:#e6db74"&gt;&amp;#34;.&amp;#34;&lt;/span&gt;, text)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt; that the result now also contains the spaces &lt;code&gt;' '&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;[&lt;span style="color:#e6db74"&gt;&amp;#39;t&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;h&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;e&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39; &amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;q&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;u&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;i&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;c&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;k&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39; &amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;b&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;r&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;o&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;w&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;n&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39; &amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;f&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;o&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;x&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39; &amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;j&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;u&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;m&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;p&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;e&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;d&amp;#39;&lt;/span&gt;,&lt;span 
style="color:#e6db74"&gt;&amp;#39; &amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;o&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;v&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;e&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;r&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39; &amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;t&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;h&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;e&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39; &amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;l&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;a&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;z&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;y&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39; &amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;d&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;o&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;g&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39; &amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;1&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;2&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;3&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;4&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;5&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;6&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;7&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;8&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;9&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;0&amp;#39;&lt;/span&gt;,
&lt;span style="color:#e6db74"&gt;&amp;#39; &amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;!&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;@&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;#&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;$&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;%&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;^&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;&amp;amp;&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;*&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;(&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;)&amp;#39;&lt;/span&gt;,&lt;span style="color:#e6db74"&gt;&amp;#39;_&amp;#39;&lt;/span&gt;]
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="find-non-word-characters"&gt;Find non-word characters&lt;/h2&gt;
&lt;p&gt;The opposite of &lt;code&gt;\w&lt;/code&gt; is &lt;a href="https://learn.microsoft.com/en-us/dotnet/standard/base-types/character-classes-in-regular-expressions#non-word-character-w"&gt;&lt;code&gt;\W&lt;/code&gt; (uppercase)&lt;/a&gt;, which we can use to find all non-word characters:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;non_word_characters &lt;span style="color:#f92672"&gt;=&lt;/span&gt; re&lt;span style="color:#f92672"&gt;.&lt;/span&gt;findall(&lt;span style="color:#e6db74"&gt;r&lt;/span&gt;&lt;span style="color:#e6db74"&gt;&amp;#34;\W&amp;#34;&lt;/span&gt;, text)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The result now contains only whitespace and symbol characters. Note that &lt;code&gt;_&lt;/code&gt; is missing, since it counts as a word character:&lt;/p&gt;
&lt;pre tabindex="0"&gt;&lt;code&gt;[&amp;#39; &amp;#39;, &amp;#39; &amp;#39;, &amp;#39; &amp;#39;, &amp;#39; &amp;#39;, &amp;#39; &amp;#39;, &amp;#39; &amp;#39;, &amp;#39; &amp;#39;, &amp;#39; &amp;#39;, &amp;#39; &amp;#39;, &amp;#39; &amp;#39;, &amp;#39;!&amp;#39;, &amp;#39;@&amp;#39;, &amp;#39;#&amp;#39;, &amp;#39;$&amp;#39;, &amp;#39;%&amp;#39;, &amp;#39;^&amp;#39;, &amp;#39;&amp;amp;&amp;#39;, &amp;#39;*&amp;#39;, &amp;#39;(&amp;#39;, &amp;#39;)&amp;#39;]
&lt;/code&gt;&lt;/pre&gt;</content></item></channel></rss>