README.md (Chinese version)
_ _ _
_ __ _ _ __| (_) ___| |_ ___ _ __
| '_ \| | | |/ _` | |/ __| __/ _ \| '__|
| |_) | |_| | (_| | | (__| || (_) | |
| .__/ \__, |\__,_|_|\___|\__\___/|_|
|_| |___/
-
Q: Why use pydictor?
A: 1. It is useful in many situations
You can use pydictor to generate a general brute-force wordlist, a custom wordlist based on web content, a social engineering wordlist, and so on;
You can use pydictor's built-in tools to safely delete, merge, de-duplicate, merge and de-duplicate, or count word frequency to filter a wordlist;
you can also specify your own wordlist and use '-tool handler' to filter it;
2. Highly customizable
You can generate highly customized and complex wordlists by modifying multiple configuration files,
adding your own dictionary, using leet mode, and filtering by length, character occurrence counts, character types, or regex;
you can even plug in your own encoding function by modifying the test_encode function in /lib/fun/encode.py.
Whether the generated password wordlist is good or bad depends largely on your customized rules and on how skillfully you use pydictor;
3. Powerful and flexible configuration file parsing
Little needs to be said here: once you are skilled with it, you will love it
4. Great compatibility
Whether you use Python 2.7 or Python 3.x, pydictor runs on Windows, Linux, and Mac;
git clone --depth=1 --branch=master https://www.github.com/landgrey/pydictor.git
cd pydictor/
chmod 755 pydictor.py
python pydictor.py


| wordlist type | number | description |
|---|---|---|
| base | 1 | basic wordlist |
| char | 2 | custom character wordlist |
| chunk | 3 | permutation and combination wordlist |
| conf | 4 | wordlist based on a configuration file |
| sedb | 5 | social engineering wordlist |
| idcard | 6 | wordlist of the last 4/6/8 characters of ID card numbers |
| extend | 7 | extended wordlist based on rules |
| scratch | 8 | wordlist based on web page keywords |
| passcraper | 9 | wordlist targeting web admins and users |
| handler | 10 | process the input file and generate a wordlist |
| uniqifer | 11 | de-duplicate the input file and generate a wordlist |
| counter | 12 | word frequency count wordlist |
| combiner | 13 | combine the input files and generate a wordlist |
| uniqbiner | 14 | combine and de-duplicate the input files and generate a wordlist |
| birthday | 15 | birthday keyword wordlist within a specified date range |

| function | supported wordlist numbers | description |
|---|---|---|
| len | 1 2 3 4 5 6 7 9 10 11 12 14 15 | length range |
| head | 1 2 3 4 5 6 7 9 10 11 12 14 15 | add a prefix to each item |
| tail | 1 2 3 4 5 6 7 9 10 11 12 14 15 | add a suffix to each item |
| encode | 1 2 3 4 5 6 7 9 10 11 12 14 15 | encode the items |
| occur | 3 4 5 7 9 10 11 12 14 | filter by occurrence counts of letters, digits, and special characters |
| types | 3 4 5 7 9 10 11 12 14 | filter by types of letters, digits, and special characters |
| regex | 3 4 5 7 9 10 11 12 14 | filter by regex |
| level | 5 7 9 | set the wordlist level |
| leet | 5 7 9 | 1337 mode |
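
The two tables above can be read as a lookup from a wordlist type to the functions it accepts. The following is a minimal Python 3 sketch of that lookup; the names `WORDLIST_NUMBERS`, `FUNCTION_SUPPORT`, and `supported_functions` are hypothetical and are not part of pydictor, and the data is copied from the tables.

```python
# Wordlist type -> number, copied from the first table.
WORDLIST_NUMBERS = {
    "base": 1, "char": 2, "chunk": 3, "conf": 4, "sedb": 5,
    "idcard": 6, "extend": 7, "scratch": 8, "passcraper": 9,
    "handler": 10, "uniqifer": 11, "counter": 12, "combiner": 13,
    "uniqbiner": 14, "birthday": 15,
}

# Function -> wordlist numbers it applies to, copied from the second table.
_COMMON = {1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 14, 15}
_FILTERS = {3, 4, 5, 7, 9, 10, 11, 12, 14}
_SOCIAL = {5, 7, 9}
FUNCTION_SUPPORT = {
    "len": _COMMON, "head": _COMMON, "tail": _COMMON, "encode": _COMMON,
    "occur": _FILTERS, "types": _FILTERS, "regex": _FILTERS,
    "level": _SOCIAL, "leet": _SOCIAL,
}


def supported_functions(wordlist_type):
    """Return the functions listed for a wordlist type (hypothetical helper)."""
    number = WORDLIST_NUMBERS[wordlist_type]
    return [name for name, numbers in FUNCTION_SUPPORT.items() if number in numbers]
```

For example, `supported_functions("base")` returns only the four common functions, while `supported_functions("scratch")` returns an empty list because number 8 appears in no row.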
python pydictor.py -base d --len 4 4 --output D:\exists\or\not\dict.txt
python pydictor.py -base L --len 1 3 --encode b64
python pydictor.py -base dLc -o /awesome/pwd
python pydictor.py -char "abc123._@ " --len 1 3 --tail @site
python pydictor.py -chunk abc ABC 666 . _ @ "'" --head a --tail 123 --encode md5
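
To make the `-base` and `-char` commands above more concrete, here is a minimal Python 3 sketch of what such a wordlist could contain. It is not pydictor's implementation: the function name `product_wordlist` is hypothetical, and it assumes that `--len` bounds the generated core and that `--head` and `--tail` are added afterwards.

```python
import itertools
import string


def product_wordlist(charset, min_len, max_len, head="", tail=""):
    """Yield every string over charset with min_len <= length <= max_len.

    Assumption: the length range applies to the core only; head and tail
    are attached afterwards.
    """
    for length in range(min_len, max_len + 1):
        for combo in itertools.product(charset, repeat=length):
            yield head + "".join(combo) + tail


# Roughly what `-base d --len 4 4` describes: all 4-digit strings.
four_digits = list(product_wordlist(string.digits, 4, 4))

# Roughly what `-char "ab" --len 1 2 --tail @site` would describe.
custom = list(product_wordlist("ab", 1, 2, tail="@site"))
```

The output grows as the charset size raised to the length, so wide character sets with long ranges become very large quickly.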
Write the following names to '/names.txt':
liwell
shelly
bianji
webzhang
Then run the command:
python pydictor.py -extend /names.txt --leet 0 1 2 11 21 --level 1 --len 4 16 --occur "<=10" ">0" "<=2" -o /possbile/wordlist.lst
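
The three `--occur` arguments in the command above give ranges for the number of letters, digits, and special characters, in that order. Below is a minimal Python 3 sketch of such a filter. It is an illustration only: the names `parse_range`, `count_classes`, and `occur_filter` are hypothetical, the accepted operators are an assumption based on the examples in this README, and an empty string is assumed to mean "no constraint".

```python
import operator
import re

# Assumed operator set, inferred from examples such as "<=10", ">0", "=6".
_OPS = {
    "<=": operator.le, ">=": operator.ge, "==": operator.eq,
    "=": operator.eq, "<": operator.lt, ">": operator.gt,
}
_RANGE_RE = re.compile(r"^(<=|>=|==|=|<|>)(\d+)$")


def parse_range(expr):
    """Turn an expression like "<=10" into a predicate on a count."""
    if expr == "":
        return lambda count: True  # assumption: empty means unconstrained
    match = _RANGE_RE.match(expr)
    if match is None:
        raise ValueError("unsupported range expression: %r" % expr)
    compare, bound = _OPS[match.group(1)], int(match.group(2))
    return lambda count: compare(count, bound)


def count_classes(word):
    """Return (letters, digits, specials) counts for a word."""
    letters = sum(ch.isalpha() for ch in word)
    digits = sum(ch.isdigit() for ch in word)
    return letters, digits, len(word) - letters - digits


def occur_filter(words, letter_expr, digit_expr, special_expr):
    """Keep the words whose three counts satisfy the three ranges."""
    checks = [parse_range(e) for e in (letter_expr, digit_expr, special_expr)]
    return [
        word for word in words
        if all(check(n) for check, n in zip(checks, count_classes(word)))
    ]


kept = occur_filter(["liwell", "l1well", "shelly123", "a!b@c#1"], "<=10", ">0", "<=2")
```

Here `liwell` is dropped because it has no digit, and `a!b@c#1` is dropped because it has three special characters.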
python pydictor.py -plug pid6 --types ">=0" ">=4" ">=0" --encode b64
note: the default sex is 'all'; it is set by default_sex in lib/data/data.py, where 'm' is male and 'f' is female
python pydictor.py -plug birthday 19750101 20001231 --len 6 8
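
As a rough illustration of a birthday wordlist over a date range, the Python 3 sketch below walks every day between two `yyyymmdd` bounds. It is not pydictor's implementation: the function name `birthday_words` is hypothetical, and the two output formats (8-character `yyyymmdd` and 6-character `yymmdd`) are an assumption suggested by `--len 6 8`; the real plugin may emit other formats.

```python
from datetime import date, timedelta


def _parse(yyyymmdd):
    """Parse a yyyymmdd string into a date."""
    return date(int(yyyymmdd[:4]), int(yyyymmdd[4:6]), int(yyyymmdd[6:8]))


def birthday_words(start, end):
    """Yield assumed birthday keywords for each day in [start, end]."""
    day, last = _parse(start), _parse(end)
    while day <= last:
        yield day.strftime("%Y%m%d")  # 8 characters, e.g. 19750101
        yield day.strftime("%y%m%d")  # 6 characters, e.g. 750101
        day += timedelta(days=1)


sample = list(birthday_words("19991230", "20000102"))
```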
Use the default file scraper.sites as multi-input file:
python pydictor.py -plug passcraper
python pydictor.py -plug passcraper http://www.example.com
Build a wordlist from a parsing expression:
python pydictor.py --conf "[1-9]{6,6}<none>" --output six.txt
Build the dictionary using the default file funcfg/build.conf:
python pydictor.py --conf
Build the dictionary using /my/other/awesome.conf:
python pydictor.py --conf /my/other/awesome.conf
note: the parsing rules are detailed below; see also the build.conf file
1. The basic unit of parsing is called a parsing element. A parsing element has five parts: head, character set, length range, encoding, and tail; head and tail may both be omitted;
A standard parsing element: head[characters]{minlength,maxlength}<encode-type>tail. An example parsing element: a[0-9]{4,6}<none>_
It means: build a dictionary whose prefix is "a", character set is 0-9, length range is 4-6, and suffix is "_", with no encoding
2. Currently only a single line is parsed
3. One line can contain up to 10 parsing elements
For example, [4-6,a-c,A,C,admin]{3,3}<none>_[a,s,d,f]{2,2}<none>[789,!@#]{1,2}<none> contains three parsing elements
4. If the comment character "#" is in the first position, the program will not parse the line
5. The conf function can build more precise dictionaries, down to a single character
About character sets:
You can put "-" between the first and last character of a range to join them,
and you can also use "," to separate multiple character sets, single characters, or single strings as elements of the character set;
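
The rules above can be sketched in Python 3 for a single parsing element. This is a minimal illustration under stated assumptions, not pydictor's parser: the names `ELEMENT_RE`, `expand_charset`, and `build` are hypothetical, only the `<none>` encoding is handled, only one element per call is supported, and the length range is assumed to count character-set items (so a string item such as `admin` counts as one).

```python
import itertools
import re

# head[characters]{minlength,maxlength}<encode-type>tail
ELEMENT_RE = re.compile(
    r"^(?P<head>[^\[]*)\[(?P<chars>[^\]]+)\]"
    r"\{(?P<min>\d+),(?P<max>\d+)\}<(?P<encode>\w+)>(?P<tail>.*)$"
)


def expand_charset(spec):
    """Expand "4-6,a-c,A,admin" into a list of items."""
    items = []
    for part in spec.split(","):
        if len(part) == 3 and part[1] == "-":
            # A range such as a-c: every character from start to end.
            items.extend(chr(c) for c in range(ord(part[0]), ord(part[2]) + 1))
        else:
            # A single character or a whole string is one item.
            items.append(part)
    return items


def build(element):
    """Yield the words described by one parsing element (sketch only)."""
    match = ELEMENT_RE.match(element)
    if match is None:
        raise ValueError("not a parsing element: %r" % element)
    if match.group("encode") != "none":
        raise NotImplementedError("this sketch only handles <none>")
    items = expand_charset(match.group("chars"))
    head, tail = match.group("head"), match.group("tail")
    for length in range(int(match.group("min")), int(match.group("max")) + 1):
        for combo in itertools.product(items, repeat=length):
            yield head + "".join(combo) + tail


words = list(build("a[0-2]{1,2}<none>_"))
```

With this sketch, `a[0-2]{1,2}<none>_` yields 3 one-item words and 9 two-item words, each wrapped in the prefix `a` and the suffix `_`.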
supported encodings:

| encoding | description |
|---|---|
| none | do not encode |
| b64 | base64 |
| md5 | md5 digest, 32-character output |
| md516 | md5 digest, 16-character output |
| sha1 | sha1 digest |
| url | URL encoding |
| sha256 | sha256 digest |
| sha512 | sha512 digest |
| test | interface for a customized encode function |
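
For reference, most of the encodings listed above map directly onto the Python 3 standard library. The sketch below is an illustration, not pydictor's lib/fun/encode.py: the `ENCODERS` table and `encode_item` helper are hypothetical names, UTF-8 input is assumed, and `md516` is assumed to be the common convention of taking characters 8 to 24 of the 32-character md5 hex digest.

```python
import base64
import hashlib
from urllib.parse import quote


def _hex_digest(algorithm):
    """Build an encoder that returns the hex digest for a hashlib algorithm."""
    return lambda item: hashlib.new(algorithm, item.encode("utf-8")).hexdigest()


ENCODERS = {
    "none": lambda item: item,
    "b64": lambda item: base64.b64encode(item.encode("utf-8")).decode("ascii"),
    "md5": _hex_digest("md5"),
    # Assumption: the 16-character form is the middle of the 32-character digest.
    "md516": lambda item: _hex_digest("md5")(item)[8:24],
    "sha1": _hex_digest("sha1"),
    "url": quote,
    "sha256": _hex_digest("sha256"),
    "sha512": _hex_digest("sha512"),
}


def encode_item(item, encoding="none"):
    """Encode one wordlist item with the named encoding."""
    return ENCODERS[encoding](item)
```

A custom encoding, like the `test` interface mentioned above, would be one more entry in such a table.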
Specify the input file and output the processed file:
python pydictor.py -tool handler /wordlist/raw.txt --len 6 16 --occur "" "=6" "<0" --encode b64 -o /wordlist/ok.txt
Delete the files in the currently specified output path (default: results), including all its dictionary files:
python pydictor.py -tool shredder
Delete the files whose prefix is "BASE" in the currently specified output path:
python pydictor.py -tool shredder base
The prefix (case-insensitive) can be one of 15 items: base,char,chunk,conf,sedb,idcard,extend,handler,uniqifer,counter,combiner,uniqbiner,scratch,passcraper,birthday
Besides, you can securely shred a file or a whole directory as follows:
python pydictor.py -tool shredder /data/mess
python pydictor.py -tool shredder D:\mess\1.zip
To speed up secure deletion, files are erased and rewritten once by default; you can change the file_rewrite_count and dir_rewrite_count values in lib/data/data.py
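
The idea of erasing and rewriting a file before deleting it can be sketched in Python 3 as follows. This is a minimal illustration, not pydictor's shredder: the function name `shred_file` is hypothetical, its `rewrite_count` parameter only mirrors the role described for file_rewrite_count, and the whole file is overwritten in one write, which suits small files only.

```python
import os


def shred_file(path, rewrite_count=1):
    """Overwrite a file with random bytes rewrite_count times, then delete it."""
    size = os.path.getsize(path)
    with open(path, "r+b") as handle:
        for _ in range(rewrite_count):
            handle.seek(0)
            handle.write(os.urandom(size))
            handle.flush()
            os.fsync(handle.fileno())  # push the overwrite to disk
    os.remove(path)
```

More passes cost proportionally more time, which is the trade-off behind a default of one pass. On SSDs and journaling or copy-on-write file systems, overwriting in place does not guarantee that the old data is gone.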
python pydictor.py -tool uniqifer /tmp/my.dic
Select the 100 most frequent words in /tmp/mess.txt, print them to the terminal, and save them to a file:
python pydictor.py -tool counter vs /tmp/mess.txt 100
note: by default 100 items are printed or saved; the default separator is "\n", which you can change via the counter_split value in lib/data/data.py
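
Word frequency counting of this kind can be sketched with `collections.Counter` in Python 3. This is an illustration rather than pydictor's code: the function name `top_words` is hypothetical, and its `limit` and `separator` defaults simply mirror the defaults described in the note above.

```python
from collections import Counter


def top_words(text, limit=100, separator="\n"):
    """Return the limit most frequent words as (word, count) pairs."""
    words = [word for word in text.split(separator) if word]
    return Counter(words).most_common(limit)


ranking = top_words("a\nb\na\nc\nb\na", limit=2)
```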
python pydictor.py -tool combiner /my/messdir
python pydictor.py -tool uniqbiner /my/messdir
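
Combining a directory of wordlists and removing duplicates can be sketched in Python 3 as below. This is not pydictor's implementation: the function names `combine` and `unique` are hypothetical, and the sketch assumes that blank lines are skipped and that the first occurrence of each word keeps its position.

```python
import os


def combine(directory):
    """Collect the non-blank lines of every file under directory."""
    lines = []
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            file_path = os.path.join(root, name)
            with open(file_path, encoding="utf-8", errors="ignore") as handle:
                lines.extend(line.rstrip("\r\n") for line in handle if line.strip())
    return lines


def unique(items):
    """Drop duplicates while keeping first-seen order."""
    return list(dict.fromkeys(items))
```

In this sketch, `combine(path)` corresponds to the combiner idea and `unique(combine(path))` to the uniqbiner idea.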
python pydictor.py -extend bob adam sarah --level 5
leet char = replacement char
a = 4
b = 6
e = 3
l = 1
i = 1
o = 0
s = 5
0 default, replace all leet chars
1 left-to-right, replace every occurrence of the first leet char encountered
2 right-to-left, replace every occurrence of the first leet char encountered
11-19 left-to-right, replace the first leet char encountered, at most (code - 10) times
21-29 right-to-left, replace the first leet char encountered, at most (code - 20) times
| code | old string | new string |
|---|---|---|
| 0 | as a airs trees | 45 4 41r5 tr335 |
| 1 | as a airs trees | 4s 4 4irs trees |
| 2 | as a airs trees | a5 a air5 tree5 |
| 11 | as a airs trees | 4s a airs trees |
| 12 | as a airs trees | 4s 4 airs trees |
| 13 | as a airs trees | 4s 4 4irs trees |
| 14 | as a airs trees | 4s 4 4irs trees |
| ... | as a airs trees | 4s 4 4irs trees |
| 21 | as a airs trees | as a airs tree5 |
| 22 | as a airs trees | as a air5 tree5 |
| 23 | as a airs trees | a5 a air5 tree5 |
| 24 | as a airs trees | a5 a air5 tree5 |
| ... | as a airs trees | a5 a air5 tree5 |
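
The mode codes and the table above can be reproduced with a short Python 3 sketch. It is a minimal illustration, not pydictor's implementation: the names `LEET_TABLE` and `leet` are hypothetical, and matching is assumed to be case-sensitive on the lowercase letters listed in the leet table.

```python
# Leet table as listed above.
LEET_TABLE = {"a": "4", "b": "6", "e": "3", "l": "1", "i": "1", "o": "0", "s": "5"}


def leet(text, code=0, table=LEET_TABLE):
    """Apply a leet mode code to text (sketch of the rules above)."""
    if code == 0:
        # Replace every leet char.
        return "".join(table.get(ch, ch) for ch in text)
    if code in (1, 2):
        limit, reverse = None, code == 2          # no cap on replacements
    elif 11 <= code <= 19:
        limit, reverse = code - 10, False         # left-to-right, capped
    elif 21 <= code <= 29:
        limit, reverse = code - 20, True          # right-to-left, capped
    else:
        raise ValueError("unsupported leet mode code: %r" % code)

    chars = list(text)
    order = range(len(chars) - 1, -1, -1) if reverse else range(len(chars))
    target = None   # the first leet char met in scan order
    replaced = 0
    for index in order:
        ch = chars[index]
        if target is None and ch in table:
            target = ch
        if ch == target:
            chars[index] = table[ch]
            replaced += 1
            if limit is not None and replaced >= limit:
                break
    return "".join(chars)
```

For example, `leet("as a airs trees", 12)` replaces only the first two occurrences of `a` and gives `4s 4 airs trees`, matching the row for code 12.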
Besides, you can also:
modify /funcfg/leet_mode.conf to add or delete leet table items;
modify the extend_leet, passcraper_leet, and sedb_leet arguments in /lib/data/data.py to choose which functions use leet mode by default;
modify the leet_mode_code argument in /lib/data/data.py to choose the default mode code;
--occur [range of letter occurrences] [range of digit occurrences] [range of special character occurrences]
default occurrence ranges
``` "<=99" "