Some of my works and projects in data science and signal processing
This is a solver for a constraint satisfaction problem: given the structure of a crossword puzzle (i.e., which squares of the grid are meant to be filled in with a letter) and a list of words to use, the problem becomes one of choosing which word should go in each vertical or horizontal sequence of squares. It’s an optimization problem.
It was made as an activity for Harvard’s course CS50’s Introduction to Artificial Intelligence with Python.
The program has several parts.
First, it checks, for each position, which words could fit (the number of squares must equal the number of letters in the word). This is called node consistency, and it generates a domain (a subset of the words) for each position.
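As a rough illustration of this step, here is a minimal sketch. It is not the code from generate.py: the slot names, the `lengths` dictionary and the function name are assumptions made up for the example.

```python
def enforce_node_consistency(domains, lengths):
    """Keep, for each slot, only the words whose length equals the slot's length."""
    return {
        slot: {word for word in words if len(word) == lengths[slot]}
        for slot, words in domains.items()
    }


# Hypothetical slots: one with 4 squares and one with 6 squares.
lengths = {"across": 4, "down": 6}
words = {"YEAR", "EACH", "BOX", "BEHIND", "ETHICS"}

# Every slot starts with the full word list as its domain.
domains = {slot: set(words) for slot in lengths}
domains = enforce_node_consistency(domains, lengths)

print(sorted(domains["across"]))  # ['EACH', 'YEAR']
print(sorted(domains["down"]))    # ['BEHIND', 'ETHICS']
```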
After that, arc consistency is enforced using the AC-3 algorithm. For each pair of positions that cross, a word is deleted from its domain when the other domain has no word with the same letter in the square where they intersect.
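The pruning can be sketched as follows, assuming each crossing is stored as a pair of letter indices. The `overlaps` dictionary, the slot names and the function names are illustrative assumptions, not the actual generate.py code.

```python
from collections import deque


def revise(domains, overlaps, x, y):
    """Remove from x's domain every word with no matching partner in y's domain."""
    i, j = overlaps[(x, y)]  # letter i of x shares a square with letter j of y
    removed = {
        wx for wx in domains[x]
        if not any(wx[i] == wy[j] for wy in domains[y])
    }
    domains[x] -= removed
    return bool(removed)


def ac3(domains, overlaps):
    """Prune domains until every arc is consistent; False if a domain empties."""
    queue = deque(overlaps)  # start with every arc (x, y)
    while queue:
        x, y = queue.popleft()
        if revise(domains, overlaps, x, y):
            if not domains[x]:
                return False
            # x changed, so arcs pointing at x must be checked again.
            for (z, w) in overlaps:
                if w == x and z != y:
                    queue.append((z, x))
    return True


# Hypothetical crossing: 4th letter of "across" meets 3rd letter of "down".
overlaps = {("across", "down"): (3, 2), ("down", "across"): (2, 3)}
domains = {
    "across": {"EACH", "EGGS", "YEAR"},
    "down": {"BEHIND", "ABSENT"},
}

ok = ac3(domains, overlaps)
print(ok, sorted(domains["across"]))  # True ['EACH', 'EGGS']
```

Here YEAR is dropped because neither BEHIND nor ABSENT has an R as its third letter.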
Then a backtracking search is run: from the remaining options, the AI chooses one word and then checks which other words fit in the remaining positions. If at some point there is a conflict, the algorithm goes back and tries again with different words.
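A minimal sketch of such a search is shown below. It assumes two rules, that crossing words must agree on the shared letter and that no word is used twice. All names are hypothetical, and the real program may pick slots and words in a different order.

```python
def consistent(assignment, overlaps):
    """Check that no word repeats and that crossing words share the same letter."""
    words = list(assignment.values())
    if len(set(words)) != len(words):
        return False
    for (x, y), (i, j) in overlaps.items():
        if x in assignment and y in assignment:
            if assignment[x][i] != assignment[y][j]:
                return False
    return True


def backtrack(assignment, domains, overlaps):
    """Return a complete assignment, or None if this branch is a dead end."""
    if len(assignment) == len(domains):
        return assignment
    # Pick the first slot that has no word yet.
    slot = next(s for s in domains if s not in assignment)
    for word in sorted(domains[slot]):
        candidate = {**assignment, slot: word}
        if consistent(candidate, overlaps):
            result = backtrack(candidate, domains, overlaps)
            if result is not None:
                return result
    return None  # every word failed here, so the caller tries its next word


# Hypothetical crossing: 4th letter of "across" meets 3rd letter of "down".
overlaps = {("across", "down"): (3, 2)}
domains = {"across": {"DEAR", "EACH"}, "down": {"BEHIND", "ETHICS"}}

solution = backtrack({}, domains, overlaps)
print(solution)  # {'across': 'EACH', 'down': 'BEHIND'}
```

DEAR is tried first and fails against both vertical words, so the search goes back and tries EACH instead.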
It uses two .txt files: one with the structure (# marks closed blocks and _ marks open squares) and another with the words (just a list, one word per line).
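To make the structure format concrete, the sketch below turns such a grid into the list of positions to fill. It assumes a position is any horizontal or vertical run of two or more open squares. The function names and the tuple layout (row, column, direction, length) are illustrative, not taken from generate.py.

```python
def runs(cells):
    """Yield (start, length) for each run of two or more open cells."""
    start = None
    # A trailing closed cell makes sure the last run is flushed.
    for pos, is_open in enumerate(list(cells) + [False]):
        if is_open and start is None:
            start = pos
        elif not is_open and start is not None:
            if pos - start > 1:
                yield start, pos - start
            start = None


def find_slots(structure):
    """Return (row, col, direction, length) for every slot in the grid text."""
    rows = [[ch == "_" for ch in line] for line in structure.splitlines()]
    width = max(len(row) for row in rows)
    # Pad short lines so every row has the same width.
    rows = [row + [False] * (width - len(row)) for row in rows]
    slots = []
    for i, row in enumerate(rows):
        slots += [(i, j, "across", n) for j, n in runs(row)]
    for j in range(width):
        column = [row[j] for row in rows]
        slots += [(i, j, "down", n) for i, n in runs(column)]
    return slots


structure = "\n".join([
    "######_",
    "____##_",
    "_##____",
    "_##_##_",
    "_##_##_",
    "#___##_",
])

slots = find_slots(structure)
for slot in slots:
    print(slot)
```

For this grid the sketch finds six positions: three horizontal ones of lengths 4, 4 and 3, and three vertical ones of lengths 4, 5 and 6.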
Command:
$ python generate.py [structure.txt] [words.txt]
Look in my repo to download the files and run the program. The “data” folder contains some examples.
We can try it with structure2.txt and words2.txt.
structure2.txt
######_
____##_
_##____
_##_##_
_##_##_
#___##_
words2.txt (fragment)
a
abandon
ability
able
abortion
about
above
abroad
absence
absolute
absolutely
absorb
abuse
academic
accept
[...]
yet
yield
you
young
your
yours
yourself
youth
zone
We run:
$ python generate.py structure2.txt words2.txt
And we get:
█████████
███████B█
█YEAR██E█
█A██EACH█
█R██L██I█
█D██A██N█
██BOX██D█
█████████
After that, we can delete the word “behind” from words2.txt and run it again:
█████████
███████E█
█YEAR██T█
█A██EACH█
█R██L██I█
█D██A██C█
██BOX██S█
█████████
To test other possibilities, I downloaded a random list of 2,500 words from https://www.randomlists.com/ and ran the same puzzle with those words:
█████████
███████A█
█YEAR██B█
█A██EGGS█
█R██L██E█
█D██A██N█
██BOX██T█
█████████
It works!