We know how to search or replace text in our Delphi/C++Builder applications using regular expressions. Similarly, Python's standard library and some other low-level modules can search or replace text using regular expressions. However, when searching for or replacing keywords in documents at scale from within Python, the standard library's regular expressions can be too slow, particularly as the number of keywords grows. To solve this, FlashText, a high-performance Python library, is available. In this post, you will learn how to use FlashText with Python4Delphi in a Delphi/C++Builder application for Python GUI apps.
Delphi itself offers fast text processing via TRegEx. If you have an existing Python application and need faster text processing, you could use FlashText, or you could bring the text over to Delphi via Python4Delphi and process it inside Delphi itself. You can use Python4Delphi in a number of different ways, such as:
- Create a Windows GUI around your existing Python app.
- Add Python scripting to your Delphi Windows apps.
- Add parallel processing to your Python apps through Delphi threads.
- Enhance your speed-sensitive Python apps with functions from Delphi.
Prerequisites.
- If Python and Python4Delphi are not installed on your machine, check this post on how to run a simple Python script in a Delphi application using the Python4Delphi sample app.
- Open the Windows command prompt and type pip install -U flashtext to install FlashText. For more information on installing Python modules, check here.
- First, run the Demo1 project for executing a Python script in Python for Delphi. Then load the script in the Memo1 field and press the Execute Script button to see the result. Go to GitHub to download the Demo1 source.
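Before loading the sample script, it can help to confirm that FlashText is visible to the interpreter that will run it. The following is a minimal sketch using only the standard library; is_installed is a hypothetical helper name, not part of FlashText or Python4Delphi.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be located."""
    # find_spec returns None when no importable module of that name exists
    return importlib.util.find_spec(module_name) is not None


if is_installed("flashtext"):
    print("flashtext is available")
else:
    print("flashtext not found; run: pip install -U flashtext")
```

Running this from the Demo1 memo checks the interpreter that Python4Delphi actually loaded, which may differ from the one pip installed into if several Python versions are present on the machine.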
FlashText Python library sample script details: the sample script demonstrates how to,
- Extract or replace keywords in a document.
- Extract keywords case-sensitively, get the span of each extracted keyword, and get extra information for each keyword extracted.
- Add multiple keywords simultaneously and remove keywords.
- Check the number of terms in a KeywordProcessor, check whether a term is present in it, and get all keywords in the dictionary.
from flashtext import KeywordProcessor

keyword_processor = KeywordProcessor()
# keyword_processor.add_keyword(<unclean name>, <standardised name>)
keyword_processor.add_keyword('Big Apple', 'New York')
keyword_processor.add_keyword('Bay Area')
keywords_found = keyword_processor.extract_keywords('I love Big Apple and Bay Area.')
print(keywords_found)

# Replace keywords
keyword_processor.add_keyword('New Delhi', 'NCR region')
new_sentence = keyword_processor.replace_keywords('I love Big Apple and new delhi.')
print(new_sentence)

# Case-sensitive example
keyword_processor = KeywordProcessor(case_sensitive=True)
keyword_processor.add_keyword('Big Apple', 'New York')
keyword_processor.add_keyword('Bay Area')
keywords_found = keyword_processor.extract_keywords('I love big Apple and Bay Area.')
print(keywords_found)

# Span of keywords extracted
keyword_processor = KeywordProcessor()
keyword_processor.add_keyword('Big Apple', 'New York')
keyword_processor.add_keyword('Bay Area')
keywords_found = keyword_processor.extract_keywords('I love big Apple and Bay Area.', span_info=True)
print(keywords_found)

# Get extra information with keywords extracted
kp = KeywordProcessor()
kp.add_keyword('Taj Mahal', ('Monument', 'Taj Mahal'))
kp.add_keyword('Delhi', ('Location', 'Delhi'))
print(kp.extract_keywords('Taj Mahal is in Delhi.'))

# Add multiple keywords simultaneously
keyword_processor = KeywordProcessor()
keyword_dict = {
    "java": ["java_2e", "java programing"],
    "product management": ["PM", "product manager"]
}
keyword_processor.add_keywords_from_dict(keyword_dict)
keyword_processor.add_keywords_from_list(["java", "python"])
print(keyword_processor.extract_keywords('I am a product manager for a java_2e platform'))

# Remove keywords
keyword_processor.remove_keyword('java_2e')
# You can also remove keywords from a list/dictionary
keyword_processor.remove_keywords_from_dict({"product management": ["PM"]})
keyword_processor.remove_keywords_from_list(["java programing"])
print(keyword_processor.extract_keywords('I am a product manager for a java_2e platform'))

# Check the number of terms in the KeywordProcessor
print(len(keyword_processor))

# Check whether a term is present in the KeywordProcessor
# ('j2ee' was never added in this script, so this prints False)
print('j2ee' in keyword_processor)

# Get all keywords in the dictionary
print(keyword_processor.get_all_keywords())
Note: The samples used for demonstration were taken from here; the only difference is that the outputs are printed. You can check the APIs and some more samples in the same place.
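For comparison, the keyword replacement step can be approximated with the standard library's re module alone. The sketch below is not taken from the FlashText samples: build_pattern and replace_keywords_re are hypothetical helper names, and the word-boundary handling is an assumption that only roughly mirrors FlashText's matching.

```python
import re

# Lowercased keyword -> replacement, mirroring the FlashText sample above
replacements = {
    "big apple": "New York",
    "new delhi": "NCR region",
}


def build_pattern(keywords):
    """Compile one case-insensitive alternation over all keywords."""
    # Longest keywords first so a longer phrase wins over a shorter prefix
    ordered = sorted(keywords, key=len, reverse=True)
    alternation = "|".join(re.escape(k) for k in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def replace_keywords_re(text, mapping):
    """Replace every matched keyword with its mapped value."""
    pattern = build_pattern(mapping)
    # Look the match up by its lowercased form, since the keys are lowercase
    return pattern.sub(lambda m: mapping[m.group(0).lower()], text)


new_sentence = replace_keywords_re("I love Big Apple and new delhi.", replacements)
print(new_sentence)
```

Every keyword adds another alternative to the pattern, and that is the part that tends to slow down as the keyword list grows.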
You have now read a quick overview of the FlashText library. Download the library from here and perform high-performance keyword search and replace in your applications. Check out Python4Delphi and easily build Python GUIs for Windows using Delphi.