Yahoo
Advertisement
Advertisement
Advertisement
Advertisement

I thought Excel was unmatched until I discovered this LibreOffice Calc capability

Icons of Microsoft Excel and LibreOffice Calc floating above a blurred spreadsheet background with charts and graphs.
Lucas Gouveia/How-To Geek

I usually defend Excel come hell or high water. However, when it comes to regular expressions (regexes), even I admit that open-source LibreOffice Calc is superior: it treats them as native search rules, simplifying conditional tasks to a single, clean formula.

Regexes are used to define and find complex patterns in text. If you've ever tried to validate messy data—like counting the number of codes that follow a particular format or extracting a code buried within a comment field—you'll know that wildcards aren't always powerful enough to give you what you want. Regex is the tool you need to achieve these kinds of high-level tasks.

Why LibreOffice Calc wins the regex battle

The core of the problem in Excel isn't that it lacks regex functionality. Indeed, in 2024, Microsoft introduced REGEXEXTRACT, REGEXTEST, and REGEXREPLACE to the long list of functions available in its spreadsheet software. However, it has significant limitations.

Advertisement
Advertisement

First, if you want to use them in the Excel desktop app, you'll need a Microsoft 365 subscription—they're not available to people using perpetual licenses, such as Office 2021. Although they're available in Excel for the web, the online version of the software lacks many other advanced features, meaning those looking to tackle complex tasks often find that it's not up to the job.

Second, even with a Microsoft 365 subscription, the regex functions cannot be used as criteria in existing functions. For example, if you want to count all the cells in a range containing a specific pattern, you can't simply type =COUNTIF(A:A,REGEXTEST(...)) because Excel doesn't natively understand the REGEXTEST output.

On the other hand, Calc supports regex as a core part of its ecosystem. As well as offering a single REGEX function to extract or replace text, any of Calc's standard conditional functions, such as COUNTIF or SUMIF, can use the regex pattern directly within their criteria. As a result, you can use regex reliably, it doesn't cost you a cent, and you end up with a single, clean formula.

While regex support is native to Calc, it's disabled by default. To enable it, in the Tools menu, click Options > LibreOffice Calc > Calc, then check "Enable regular expressions in formulas."

Advertisement
Advertisement

Let's explore some real-world scenarios to see the benefit of using Calc over Excel.

Calc vs. Excel: Conditional counting

Scenario:I want to count the number of cells in column A that contain three uppercase letters, followed by a hyphen, followed by exactly four digits (ABC-1234).

A Calc spreadsheet with codes in column A and the ABC-1234 count format in column C.

The LibreOffice Calc regex formula

In Calc, here's what I can type into cell E2:

where:

  • The COUNTIF function counts the number of cells matching the specified criteria.

  • A:A is the range to evaluate.

  • [A-Z]{3} matches exactly three uppercase letters.

  • The hyphen (-) is matched literally.

  • \\d{4} matches exactly four digits.

The COUNTIF function in Calc, with a regex pattern used as the criterion.

This shows just how intuitive Calc's native regex support is. It uses the familiar COUNTIF function, and because regexes are supported as a core data-matching criterion, there's no need for complex nesting or helper functions. As a result, the formula is clean, readable, and looks exactly how I'd expect a conditional counting formula to look.

The Microsoft Excel alternatives

Because Excel's native functions can't accept regex logic as arguments, when I try to replicate this simple conditional count in Excel for Microsoft 365, I need to use a complex formula, such as one that involves three nested components (SUM, --, and REGEXTEST). As a result, the formula is longer, less intuitive, harder to read, and more difficult to debug:

The REGEXTEST function nested inside SUM in Excel to count the number of codes that follow the ABC-1234 format.

What's more, because REGEXTEXT is only available in Excel for Microsoft 365, if I were using an older Excel desktop app, I'd need to use an even more complex and fragile formula that is a nightmare of nesting . This does the trick, but it's not ideal:

A long, complex formula that would need to be used in older versions of Excel to count the number of cells containing a specific string structure.
Advertisement
Advertisement

Calc vs. Excel: Extracting text from messy data

Scenario:I need to extract a six-digit code (123456) currently in parentheses from a cell containing messy text. Then, I want to see the code without the parentheses.

A Calc spreadsheet with disorganized text values in column A, each containing a six-figure code in red.

The LibreOffice Calc regex formula

To achieve this in Calc, here's what I'll type into cell B2:

where:

  • The SEARCH function finds the starting position of the pattern \\(\\d{6}\\).

  • This pattern matches an opening parenthesis, followed by exactly six digits, followed by a closing parenthesis. The parentheses are escaped with a backslash (\\).

  • The MID function then extracts the six characters, starting at the position identified by the SEARCH function plus one (to skip the opening parenthesis).

Advertisement
Advertisement

Then, I'll double-click the fill handle in the bottom-right corner of cell B2 to expand the formula down column B.

The MID function in Calc, with a regex pattern and the nested SEARCH function used to extract a six-digit code from a text string.

This formula relies on standard, familiar functions and uses the regex functionality within SEARCH to cleanly find the position of the complex pattern. Also, the +1 adjustment logic is transparent, and the 6 at the end clearly identifies the number of characters to take. As a result, the function is highly readable, and the code is highly logical.

The Microsoft Excel alternatives

Even though Excel handles this extraction and removal in one function call (compared to Calc's two), the complexity of the Excel alternative is caused by the need to use two sets of parentheses to define the match boundary and the subgroup to be extracted:

What's more, the final argument (2) adds further complexity—it's not immediately clear what this does. This means the formula is harder to construct and debug than the relatively simple Calc alternative.

The REGEXEXTRACT function in Excel used to extract a number from a messy text string.

This alternative function in an older version of Excel repeats the FIND function to locate the positions of the opening and closing parentheses, uses arithmetic to calculate the exact starting position and length for the MID function , and assumes that the source cell has only one set of parentheses:

The MID and FIND functions used in a formula in legacy Excel to extract a digit string from a messy text string.

Calc vs. Excel: Extracting and reformatting text

Scenario:I have a list of messy product identifiers in column A, often buried in descriptive text or irrelevant symbols. The identifier is always a four-digit number followed by a two-letter suffix (1234AB). My aim is to extract this identifier and reformat it so that the four-digit number and two-letter suffix are separated by a hyphen (1234-AB).

A product information list in column A of a Calc spreadsheet, with the four-digit-two-letter identifiers in each formatted red.
Advertisement
Advertisement

The LibreOffice Calc regex formulas

So far, I've incorporated regex logic into existing functions. However, this scenario is a great case for using the REGEX function alone:

where:

  • The REGEX function extracts the necessary parts and rebuilds the string.

  • .*(\\d{4})([A-Z]{2}).* is the expression. The asterisks (*) ensure the entire cell content is matched and then discarded.

  • (\\d{4}) is the first capture group (referenced by $1), matching and saving exactly four digits.

  • ([A-Z]{2}) is the second capture group (referenced by $2), matching and saving exactly two uppercase letters.

  • "$1-$2" tells Calc to reproduce these groups and separate them using a hyphen.

Then, when I double-click the fill handle, the formula is duplicated to apply the logic to the remaining identifiers.

The REGEX function used in Calc to extract an identifier from a messy text string and reformat it as four numbers, a hyphen, and two letters.

This formula demonstrates how Calc's dedicated REGEX function lets me define the complex extraction logic and the output format in a single, clear statement. I define the cell I'm working on (the text argument), tell Calc what I'm looking for and what I want to capture (the expression argument), and then define how the result should look (the replacement argument). What's more, if the ID format changes (for example, to five digits and one letter), I'd only have to change two characters in the formula.

Advertisement
Advertisement

This example also shows how flexible the function is: rather than needing separate functions for different regex tasks, I simply need to provide the correct number of arguments to execute the appropriate task. Lastly, the formula is easy to read and debug because it doesn't need to count or position characters.

The Microsoft Excel alternatives

Although the equivalent formula in Excel for Microsoft 365 is similar, it serves as a reminder that I'd still need to work out which of the three regex functions I should use—each example in this article uses a different one because there isn't a one-size-fits-all function like there is in Calc. For this task, I need to use REGEXREPLACE:

REGEXREPLACE used in Excel to extract and reformat an identifier from a messy text string.

The benefit of Calc over Excel really becomes clear when I start thinking about how to execute the same task in older versions of Excel. If I didn't use helper columns, I would end up with a formula so long and complex that it's almost unreadable and impossible to debug, and by the time I finished constructing it, I'd need a lie-down.


The comparison is clear: LibreOffice Calc's regex approach is noticeably superior.

Advertisement
Advertisement

Calc's integrated regex has simpler syntax, offers single-function options, and unifies extraction and replacement, whereas Excel requires nesting or knowledge of three separate functions to get to the same result. Also, Calc is free and cross-platform, providing powerful, robust functionality for everyone, while Excel's comparable features are limited to those with deep pockets or to users of the toned-down web version, leaving non-subscribers with fragile legacy formulas or a less robust platform.

Advertisement
Advertisement
Mobilize your Website
View Site in Mobile | Classic
Share by: