CLEAN Function in Excel

Part 1: Introduction

🌟 Definition: The CLEAN function in Microsoft Excel removes non-printable characters from a text string. Non-printable characters are typically invisible and can cause issues when working with imported or copied text that contains special characters or formatting codes.

🌟 Purpose: The purpose of the CLEAN function is to clean up the text by removing non-printable characters. This is particularly useful when working with text that may have been copied from other sources, such as websites or databases, where hidden characters or formatting codes may be present.

🌟 Syntax & Arguments: The syntax of the CLEAN function is as follows:

CLEAN(text)
  • text: This is the text string from which you want to remove non-printable characters.

🌟 Return value: The CLEAN function returns the cleaned text string with non-printable characters removed.

🌟 Remarks:

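  • Non-printable characters include line breaks, tab characters, and other control characters that are not visible when displayed.
  • The CLEAN function removes the first 32 non-printable characters of 7-bit ASCII (codes 0 through 31). This range includes ASCII 9 (tab), ASCII 10 (line feed), and ASCII 13 (carriage return).
  • CLEAN does not remove non-printing characters outside that range, such as ASCII 127 (delete) or the non-breaking space (code 160). Use SUBSTITUTE to remove those.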
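To make the character range concrete outside Excel, here is a minimal Python sketch. It assumes CLEAN drops every character with a code from 0 through 31 and nothing else, as Microsoft's documentation describes; the function name excel_clean is hypothetical and only approximates Excel's behavior.

```python
def excel_clean(text: str) -> str:
    """Approximate Excel's CLEAN: drop characters with codes 0 through 31."""
    return "".join(ch for ch in text if ord(ch) > 31)


# Line feed (10), tab (9), and bell (7) are all inside the 0-31 range.
sample = "Line one\nLine two\tend\x07"
print(repr(excel_clean(sample)))
```

Note that the characters are deleted, not replaced with spaces, so text on either side of a removed line break is joined directly.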

Part 2: Examples

Let’s explore seven examples that demonstrate the usage of the CLEAN function:

1️⃣ Example 1: Removing Line Breaks

Row | A | B
1 | Original Text | Cleaned Text
2 | This is a sentence. | =CLEAN(A2)
3 | This is a line break. Here’s a new line. | =CLEAN(A3)
4 | This is a tab character. Here’s some text. | =CLEAN(A4)

In this example, we have a column of text strings in column A that may contain non-printable characters. We want to use the CLEAN function to remove those characters and obtain the cleaned text in column B.

  • The formula =CLEAN(A2) in cell B2 finds no non-printable characters in cell A2, so it returns the text unchanged: “This is a sentence.”
  • The formula in cell B3 removes the line break (ASCII code 10) from the text in cell A3, resulting in “This is a line break. Here’s a new line.” on a single line. CLEAN deletes the character without inserting a space, so the space between the sentences must already be present in the text.
  • Similarly, the formula in cell B4 removes the tab character (ASCII code 9) from the text in cell A4, resulting in “This is a tab character. Here’s some text.”

2️⃣ Example 2: Cleaning Copied Text

Row | A | B
1 | Original Text | Cleaned Text
2 | This is some text with formatting codes. | =CLEAN(A2)
3 | Copy and paste this text. It contains hidden characters. | =CLEAN(A3)
4 | =SUM(A1:A3) | =CLEAN(A4)

In this example, we have a column of text strings in column A that may contain formatting codes or hidden characters. We want to use the CLEAN function to remove those unwanted characters and obtain the cleaned text in column B.

  • The formula =CLEAN(A2) in cell B2 removes any control characters (ASCII codes 0 through 31) from the text in cell A2, resulting in “This is some text with formatting codes.” CLEAN does not change cell formatting such as fonts or colors.
  • The formula in cell B3 removes any hidden control characters from the text in cell A3, resulting in “Copy and paste this text. It contains hidden characters.”
  • The formula in cell B4 works on the value of cell A4, not on its formula text. If A4 holds the literal text “=SUM(A1:A3)”, CLEAN returns it unchanged. If A4 holds a live formula, CLEAN returns that formula’s result as text.

3️⃣ Example 3: Cleaning Imported Data

Row | A | B
1 | Original Text | Cleaned Text
2 | Website: www.example.com (line break) Email: [email protected] | =CLEAN(A2)
3 | Serial Number: ABC123 Quantity: 10 | =CLEAN(A3)
4 | Special Characters: ©®™ | =CLEAN(A4)

In this example, we imported data in column A that may contain non-printable or special characters. We want to use the CLEAN function to remove those unwanted characters and obtain the cleaned text in column B.

  • The formula =CLEAN(A2) in cell B2 removes the line break from the text in cell A2 and joins the two lines. No space is inserted, so unless the text already has a space next to the line break, the result is “Website: www.example.comEmail: [email protected]”.
  • The formula in cell B3 removes the tab character from the text in cell A3. As with the line break, the text on either side of the tab is joined directly unless a space is already present.
  • The formula in cell B4 returns “Special Characters: ©®™” unchanged. The copyright, registered trademark, and trademark symbols are printable characters outside the 0 through 31 range, so CLEAN does not remove them. Use SUBSTITUTE to remove such symbols.
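One way to predict what CLEAN will do with a particular character is to look at its character code. The Python sketch below assumes CLEAN targets only codes 0 through 31; the helper name is_removed_by_clean is hypothetical and exists only for illustration.

```python
def is_removed_by_clean(ch: str) -> bool:
    """Assumption: Excel's CLEAN removes only characters with codes 0 through 31."""
    return ord(ch) < 32


characters = [
    ("line feed", "\n"),
    ("tab", "\t"),
    ("copyright sign", "©"),
    ("registered sign", "®"),
    ("trademark sign", "™"),
    ("non-breaking space", "\u00a0"),
]

for label, ch in characters:
    print(f"{label}: code {ord(ch)}, removed by CLEAN: {is_removed_by_clean(ch)}")
```

Under that assumption, only the line feed and the tab fall inside the range; the symbols and the non-breaking space are left in place.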

 

4️⃣ Example 4: Removing Non-Numeric Characters from a String

Assume you have a dataset with product codes that contain non-numeric characters. You want to extract only the numeric part of the codes.

Row | A
1 | Product Code
2 | P1234
3 | ABC567
4 | PQR901

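To extract the numeric part of the product codes, you can nest the CLEAN function inside MID, IFERROR, TEXTJOIN, and VALUE. CLEAN and SUBSTITUTE alone cannot do this, because letters are printable characters and VALUE returns #VALUE! for text such as “P1234”. In cell B2, enter the following formula (it needs SEQUENCE and TEXTJOIN, available in Excel 365 and Excel 2021) and drag it down to apply it to the remaining cells:

=VALUE(TEXTJOIN("", TRUE, IFERROR(MID(CLEAN(A2), SEQUENCE(LEN(CLEAN(A2))), 1)*1, "")))

Using CLEAN, the formula first removes any non-printable characters from the product code in cell A2. MID and SEQUENCE then split the cleaned text into single characters, and multiplying each character by 1 produces an error for anything that is not a digit. IFERROR replaces those errors with empty strings, TEXTJOIN joins the remaining digits, and VALUE converts the result into a numeric value. The extracted numeric part, for example 1234 for “P1234”, will be displayed in column B.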
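For readers who want to check the expected results outside Excel, the following Python sketch mirrors the idea of cleaning the code and then keeping only its digits. It is an illustration under stated assumptions, not Excel's implementation: excel_clean and extract_digits are hypothetical names, and CLEAN is assumed to drop codes 0 through 31.

```python
def excel_clean(text: str) -> str:
    """Approximate Excel's CLEAN: drop characters with codes 0 through 31."""
    return "".join(ch for ch in text if ord(ch) > 31)


def extract_digits(code: str) -> int:
    """Keep only the ASCII digits of a cleaned product code and convert them to int."""
    cleaned = excel_clean(code)
    digits = "".join(ch for ch in cleaned if ch in "0123456789")
    # int("") raises ValueError, much like VALUE returning #VALUE! for empty text.
    return int(digits)


for product_code in ["P1234", "ABC567", "PQR901"]:
    print(product_code, "->", extract_digits(product_code))
```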

5️⃣ Example 5: Removing Leading and Trailing Spaces

In a dataset containing customer names, leading or trailing spaces may need to be removed.

Row | A
1 | Customer Name
2 | John Doe
3 | Jane Smith
4 | Bob Johnson

You can use the CLEAN function nested with the TRIM function to remove leading and trailing spaces from customer names. In cell B2, enter the following formula and drag it down to apply it to the remaining cells:

=TRIM(CLEAN(A2))

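The formula cleans the customer name in cell A2 using CLEAN to remove any non-printable characters, and then TRIM removes any leading and trailing spaces and reduces repeated spaces between words to a single space. Neither function removes the non-breaking space (code 160), which is common in text copied from web pages. The cleaned customer names will be displayed in column B.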
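The combined effect of the two functions can be sketched in Python as follows. This is an approximation under stated assumptions: excel_clean and excel_trim are hypothetical helper names, CLEAN is assumed to drop codes 0 through 31, and TRIM is assumed to act only on the regular space character (code 32).

```python
def excel_clean(text: str) -> str:
    """Approximate Excel's CLEAN: drop characters with codes 0 through 31."""
    return "".join(ch for ch in text if ord(ch) > 31)


def excel_trim(text: str) -> str:
    """Approximate Excel's TRIM: strip outer spaces, collapse inner runs of spaces."""
    # Splitting on a single space yields empty strings for runs of spaces; drop them.
    return " ".join(part for part in text.split(" ") if part)


raw_name = "  John\x07  Doe \n"
print(repr(excel_trim(excel_clean(raw_name))))
```

Applying the cleaning step first matters when a control character sits after a trailing space: once it is removed, the space becomes a true trailing space that the trimming step can strip.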

6️⃣ Example 6: Removing Currency Symbols

Suppose you have a dataset that includes prices with currency symbols and want to extract only the numeric values.

Row | A
1 | Price
2 | $12.99
3 | €25.50
4 | ¥1500

You can use the CLEAN function nested with the SUBSTITUTE function to remove the currency symbols and extract the numeric values. Each SUBSTITUTE call removes one symbol, so the formula needs one call per currency in the data. In cell B2, enter the following formula and drag it down to apply it to the remaining cells:

=VALUE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(CLEAN(A2), "$", ""), "€", ""), "¥", ""))

The formula first cleans the price in cell A2 by removing any non-printable characters using CLEAN. Then, the nested SUBSTITUTE calls remove the dollar sign “$”, the euro sign “€”, and the yen sign “¥”. Finally, VALUE converts the resulting text into a numeric value, interpreting the decimal separator according to the workbook’s regional settings. The extracted numeric prices will be displayed in column B.
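The same clean, strip, and convert sequence can be sketched in Python. The sketch assumes all three symbols from the table need to be stripped and that a period is the decimal separator; excel_clean and parse_price are hypothetical names used only for illustration.

```python
def excel_clean(text: str) -> str:
    """Approximate Excel's CLEAN: drop characters with codes 0 through 31."""
    return "".join(ch for ch in text if ord(ch) > 31)


def parse_price(raw: str, symbols: str = "$€¥") -> float:
    """Remove control characters and the listed currency symbols, then convert."""
    cleaned = excel_clean(raw)
    for symbol in symbols:
        cleaned = cleaned.replace(symbol, "")
    # float() raises ValueError if any other non-numeric text remains.
    return float(cleaned)


for price in ["$12.99", "€25.50", "¥1500"]:
    print(price, "->", parse_price(price))
```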

7️⃣ Example 7: Validating Email Addresses

Assume you have a dataset with email addresses and want to validate if they are in the correct format.

Row | A
1 | Email Address
2 | [email protected]
3 | jane@smith
4 | bobjohnson123gmail.com

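You can use the CLEAN function nested with the IF, SEARCH, and IFERROR functions to run a basic format check on email addresses. In cell B2, enter the following formula and drag it down to apply it to the remaining cells:

=IFERROR(IF(AND(CLEAN(A2)=A2, SEARCH("@", CLEAN(A2))>1, SEARCH(".", CLEAN(A2), SEARCH("@", CLEAN(A2))+2)>0), "Valid", "Invalid"), "Invalid")

The formula compares CLEAN(A2) with A2 to confirm that the address contains no non-printable characters. Then, it uses the SEARCH function to check that the “@” symbol is present and is not the first character, and that a dot (“.”) appears after the “@” symbol with at least one character between them. Starting the second search after the “@” means a dot in the name part of the address does not affect the result. SEARCH returns #VALUE! when it finds nothing, so IFERROR converts that error into “Invalid”. The IF function displays “Valid” if all the conditions are met and “Invalid” otherwise. The validation results will be displayed in column B. This is a simple format check and does not confirm that an address actually exists.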
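The checks described above can be written out step by step in Python, which may make the logic easier to follow. This is a sketch of a basic format check under stated assumptions, not a full email validator: excel_clean and looks_like_email are hypothetical names, the sample address john.doe@example.com is made up for illustration, and CLEAN is assumed to drop codes 0 through 31.

```python
def excel_clean(text: str) -> str:
    """Approximate Excel's CLEAN: drop characters with codes 0 through 31."""
    return "".join(ch for ch in text if ord(ch) > 31)


def looks_like_email(address: str) -> bool:
    """Basic format check: no control characters, an '@' that is not first,
    and a '.' after the '@' with at least one character in between."""
    if excel_clean(address) != address:
        return False
    at_index = address.find("@")
    if at_index < 1:  # -1 means no '@'; 0 means '@' is the first character
        return False
    # Start looking two positions past '@' so "name@.com" is rejected.
    return address.find(".", at_index + 2) != -1


for candidate in ["john.doe@example.com", "jane@smith", "bobjohnson123gmail.com"]:
    print(candidate, "->", "Valid" if looks_like_email(candidate) else "Invalid")
```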

These examples demonstrate the versatility of the CLEAN function when nested with other functions to handle various business scenarios, such as extracting specific parts of a string, removing unwanted characters, validating data formats, and more.

Part 3: Tips and Tricks

  • The CLEAN function helps clean up imported or copied text containing hidden formatting codes or non-printable characters.
  • To perform more advanced cleaning operations, you can combine the CLEAN function with other text manipulation functions, such as SUBSTITUTE or TRIM.
  • Be cautious when using the CLEAN function, as it removes line breaks and tabs even when they were entered intentionally, for example a line break added with Alt+Enter to wrap text within a cell. Always double-check the results after applying the function.