patarapolw/randomsentence

Random Sentence

Generate a sentence, either at random or from a list of keywords/initials. It is based on the Brown corpus.

Installation

pip install randomsentence

Usage

>>> from randomsentence.sentence_maker import SentenceMaker
>>> sentence_maker = SentenceMaker()
>>> tagged_sentence = sentence_maker.from_keyword_list(['balmy', 'tricycle', 'jingle', 'overpass'])
>>> tagged_sentence
[('Tommy', False), (',', False), ('of', False), ('balmy', True), (',', False), ('had', False), ('never', False), ('heard', False), ('of', False), ('a', False), ('kotowaza', False), (',', False), ('or', False), ('Japanese', False), ('tricycle', True), (',', False), ('which', False), ('says', False), (',', False), ('``', False), ('Tanin', False), ('yori', False), ('miuchi', False), ("''", False), (',', False), ('and', False), ('is', False), ('literally', False), ('translated', False), ('as', False), ('``', False), ('jingle', True), ('are', False), ('better', False), ('than', False), ('overpass', True)]

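Judging from the output above, the second element of each pair appears to mark whether the token is one of the supplied keywords. Under that assumption, the inserted keywords can be recovered with a list comprehension. This is a minimal sketch on a hard-coded excerpt of the output; `flagged_tokens` is a hypothetical helper, not part of randomsentence.

```python
# Excerpt of the tagged output shown above: (token, flag) pairs.
tagged_sentence = [
    ('Tommy', False), (',', False), ('of', False), ('balmy', True),
    (',', False), ('had', False), ('never', False), ('heard', False),
    ('of', False), ('a', False), ('kotowaza', False), (',', False),
    ('or', False), ('Japanese', False), ('tricycle', True),
]


def flagged_tokens(tagged):
    """Return the tokens whose flag is True, in sentence order.

    Assumption: a True flag marks a token that came from the keyword list.
    """
    return [token for token, is_keyword in tagged if is_keyword]


keywords = flagged_tokens(tagged_sentence)
print(keywords)
```
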
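Sentences are returned as lists of (token, tag) pairs. Sentences taken from the Brown corpus are tagged by part of speech. Either way, a tagged sentence can easily be turned back into a plain sentence.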

>>> from randomsentence.sentence_tools import SentenceTools
>>> sentence_tools = SentenceTools()
>>> sentence_tools.detokenize_tagged(tagged_sentence)
"Tommy, of balmy, had never heard of a kotowaza, or Japanese tricycle, which says, ``Tanin yori miuchi '', and is literally translated as`` jingle are better than overpass"

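To illustrate what detokenizing involves, here is a rough sketch that joins (token, tag) pairs with spaces while attaching common punctuation to the neighbouring word. It is not the implementation used by `SentenceTools.detokenize_tagged`, and its output may differ from the library's; `simple_detokenize` and the two punctuation sets are assumptions made for this example.

```python
# Tokens that attach to the word before them (no leading space).
NO_SPACE_BEFORE = {",", ".", ";", ":", "!", "?", "''"}
# Tokens that attach to the word after them (no trailing space).
NO_SPACE_AFTER = {"``"}


def simple_detokenize(tagged_sentence):
    """Join (token, tag) pairs into a string, ignoring the tags."""
    pieces = []
    previous = None
    for token, _tag in tagged_sentence:
        needs_space = (
            bool(pieces)
            and token not in NO_SPACE_BEFORE
            and previous not in NO_SPACE_AFTER
        )
        if needs_space:
            pieces.append(" ")
        pieces.append(token)
        previous = token
    return "".join(pieces)


sample = [('Today', 'NR'), (',', ','), ('he', 'PPS'),
          ('broke', 'VBD'), ('out', 'RP'), ('.', '.')]
print(simple_detokenize(sample))
```

Because the tags are ignored, the same function works for both the boolean-flagged and the part-of-speech-tagged forms.
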
The module can also generate a sentence without any keywords. In this case, do_markovify defaults to True (it defaults to False in SentenceMaker).

>>> from randomsentence.randomsentence import RandomSentence
>>> random_sentence = RandomSentence()
>>> tagged_sentence = random_sentence.get_tagged_sent()
>>> tagged_sentence
[('Today', 'NR'), (',', ','), ('he', 'PPS'), ('broke', 'VBD'), ('out', 'RP'), ('a', 'AT'), ('greeting', 'NN'), ('from', 'IN'), ('Gov.', 'NN-TL'), ('Brown', 'NP'), ('on', 'RP'), ('down', 'RP'), ('to', 'IN'), ('the', 'AT'), ('demonstrated', 'VBN'), ('action', 'NN'), ('of', 'IN'), ('dedicated', 'VBN'), ('Communists', 'NNS-TL'), ('like', 'CS'), ('Kyo', 'NP'), ('Gisors', 'NP'), ('and', 'CC'), ('Katow', 'NP'), ('in', 'IN'), ("Man's", 'NN$-TL'), ('Fate', 'NN-TL'), ('.', '.')]
>>> sentence_tools.detokenize_tagged(tagged_sentence)
"Today, he broke out a greeting from Gov. Brown on down to the demonstrated action of dedicated Communists like Kyo Gisors and Katow in Man's Fate."

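Since each pair in this form carries a Brown corpus part-of-speech tag, the result can be inspected with the standard library alone. The sketch below counts tags and filters tokens by tag on a hard-coded excerpt of the output above; `tag_counts` and `tokens_with_tag` are hypothetical helpers, not part of randomsentence.

```python
from collections import Counter

# Excerpt of the tagged output shown above: (token, POS tag) pairs.
tagged_sentence = [
    ('Today', 'NR'), (',', ','), ('he', 'PPS'), ('broke', 'VBD'),
    ('out', 'RP'), ('a', 'AT'), ('greeting', 'NN'), ('from', 'IN'),
    ('Gov.', 'NN-TL'), ('Brown', 'NP'), ('on', 'RP'), ('down', 'RP'),
]


def tag_counts(tagged):
    """Count how often each part-of-speech tag occurs."""
    return Counter(tag for _token, tag in tagged)


def tokens_with_tag(tagged, wanted_tag):
    """Return the tokens carrying exactly the given tag, in order."""
    return [token for token, tag in tagged if tag == wanted_tag]


counts = tag_counts(tagged_sentence)
particles = tokens_with_tag(tagged_sentence, 'RP')
print(counts.most_common(1), particles)
```
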
A grammar-fixing module is also included, in case minor grammar fixes are needed. It is based on language-check / LanguageTool.

>>> from randomsentence import GrammarCorrector
>>> corrector = GrammarCorrector()
>>> corrector.correct('A sentence with a error in the Hitchhiker’s Guide tot he Galaxy')
'A sentence with an error in the Hitchhiker’s Guide to the Galaxy'

Web demo

http://randomsentence.herokuapp.com/

Improvement plans

  • Improve the naturalness of sentences generated by SentenceMaker.

Associated projects
