How can Python determine whether a string contains Chinese text?

1 year ago

Liam

To check whether a string contains Chinese characters, you can use a regular expression. For example, the search function in the re module scans the string and reports whether any character falls within a given Unicode range.

import re

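def is_chinese(string):
    # Match any single character in the range U+4E00 to U+9FA5.
    pattern = re.compile('[\u4e00-\u9fa5]')
    # search() returns a match object on the first hit, or None if there is none.
    return pattern.search(string) is not None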

# Example usage
print(is_chinese("Hello, 你好"))  # True
print(is_chinese("Hello World"))  # False

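In the code above, the regular expression [\u4e00-\u9fa5] matches a single character in the range U+4E00 to U+9FA5, which covers the most commonly used Chinese characters. The function returns True if the string contains at least one such character and False otherwise. Note that it does not check that the whole string is Chinese, and the range excludes Chinese punctuation as well as rarer characters in the Unicode extension blocks.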
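
If "is in Chinese" should mean more than "contains at least one Chinese character", the same idea can be extended. The following is only a minimal sketch, not part of the original answer: the names contains_han, is_all_han and han_ratio are hypothetical, and the character ranges (the CJK Unified Ideographs block U+4E00 to U+9FFF plus Extension A, U+3400 to U+4DBF) are an assumption about which characters you want to count. These characters are also used in Japanese and Korean text, so a match does not prove the language is Chinese.

```python
import re

# Assumed ranges: CJK Unified Ideographs (U+4E00-U+9FFF) and Extension A (U+3400-U+4DBF).
HAN_CLASS = r'[\u3400-\u4dbf\u4e00-\u9fff]'
HAN_CHAR = re.compile(HAN_CLASS)
HAN_ONLY = re.compile(HAN_CLASS + '+')

def contains_han(text):
    """Return True if at least one character falls in the assumed ranges."""
    return HAN_CHAR.search(text) is not None

def is_all_han(text):
    """Return True only if text is non-empty and every character is in the assumed ranges."""
    return HAN_ONLY.fullmatch(text) is not None

def han_ratio(text):
    """Return the fraction of non-whitespace characters in the assumed ranges."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(1 for c in chars if HAN_CHAR.match(c)) / len(chars)

print(contains_han("Hello, 你好"))  # True
print(is_all_han("你好"))           # True
print(is_all_han("你好!"))          # False
print(han_ratio("你好 ab"))         # 0.5
```

A ratio with a threshold of your choosing tends to be more useful than a strict all-or-nothing check for mixed text, since real Chinese sentences usually contain punctuation, digits, or Latin letters.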