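sqlmap Source Code Analysis (Part 1)

I took a quick read through the source. It is genuinely tough going, and I only came away with a rough picture of the overall flow 😭

No arguments

We start the analysis from the entry file.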

if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        pass
    except SystemExit:
        raise
    except:
        traceback.print_exc()
    finally:
        # Reference: http://stackoverflow.com/questions/1635080/terminate-a-multi-thread-python-program
        if threading.activeCount() > 1:
            os._exit(getattr(os, "_exitcode", 0))
        else:
            sys.exit(getattr(os, "_exitcode", 0))
else:
    # cancelling postponed imports (because of Travis CI checks)
    from lib.controller.controller import start

Go straight to the main function:

........
dirtyPatches()
resolveCrossReferences()
checkEnvironment()
setPaths(modulePath())
banner()

It opens with a handful of function calls that check the environment sqlmap is running in, set up paths, and so on.

Further down, the command-line arguments are read and registered:

# Store original command line options for possible later restoration
args = cmdLineParser()
cmdLineOptions.update(args.__dict__ if hasattr(args, "__dict__") else args)
initOptions(cmdLineOptions)

Debugging directly in PyCharm and stepping into cmdLineParser(): since no arguments were set, argv has the following value:

image-20200119164008396

Further down there is a call to:

checkSystemEncoding()

Its main job is to set up the system encoding. Moving on:

_ = getUnicode(os.path.basename(argv[0]), encoding=sys.stdin.encoding)  # sqlmap.py
usage = "%s%s [options]" % ("%s " % os.path.basename(sys.executable) if not IS_WIN else "", "\"%s\"" % _ if " " in _ else _)  # sqlmap.py [options]
parser = ArgumentParser(usage=usage)

Next the options are defined. See the argparse documentation for the details:

……
parser.add_argument("--hh", dest="advancedHelp", action="store_true",
    help="Show advanced help message and exit")

parser.add_argument("--version", dest="showVersion", action="store_true",
    help="Show program's version number and exit")

parser.add_argument("-v", dest="verbose", type=int,
    help="Verbosity level: 0-6 (default %d)" % defaults.verbose)
……
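To make these option definitions a bit more concrete, here is a minimal standalone sketch using Python's standard argparse module. It is not sqlmap's parser: the default verbosity of 1, the build_parser helper and the sample argument list are assumptions made up for illustration.

```python
import argparse

# Assumed default for this sketch only; sqlmap takes it from defaults.verbose
DEFAULT_VERBOSE = 1

def build_parser():
    parser = argparse.ArgumentParser(usage="sqlmap.py [options]")
    # store_true switches are False unless the switch is present
    parser.add_argument("--hh", dest="advancedHelp", action="store_true",
                        help="Show advanced help message and exit")
    # type=int converts the raw command-line string to an integer
    parser.add_argument("-v", dest="verbose", type=int, default=DEFAULT_VERBOSE,
                        help="Verbosity level: 0-6 (default %d)" % DEFAULT_VERBOSE)
    parser.add_argument("-u", "--url", dest="url", help="Target URL")
    return parser

args = build_parser().parse_args(["-u", "http://localhost/?a=", "-v", "3"])
# The Namespace can be copied into a plain dict, much like
# cmdLineOptions.update(args.__dict__) does
options = dict(args.__dict__)
print(options)
```

The dest names decide which attribute each option ends up in, which is why the later code can simply read args.url or args.verbose.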

Since no arguments were set, there is nothing to stop for here, so we keep going:

def _(self, *args):
    retVal = parser.formatter._format_option_strings(*args)
    if len(retVal) > MAX_HELP_OPTION_LENGTH:
        retVal = ("%%.%ds.." % (MAX_HELP_OPTION_LENGTH - parser.formatter.indent_increment)) % retVal
    return retVal

parser.formatter._format_option_strings = parser.formatter.format_option_strings
parser.formatter.format_option_strings = type(parser.formatter.format_option_strings)(_, parser)
# Dirty hack for inherent help message of switch '-h'
if hasattr(parser, "get_option"):
    option = parser.get_option("-h")
    option.help = option.help.capitalize().replace("this help", "basic help")
……
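The snippet above swaps a method on a live object for a wrapper that truncates long option strings. Below is a small sketch of the same trick on a toy class. Formatter, truncated and MAX_LEN are hypothetical names for this example, and types.MethodType is used where the original calls type(bound_method)(...), which amounts to the same thing.

```python
import types

MAX_LEN = 12  # assumed limit for this sketch (sqlmap uses MAX_HELP_OPTION_LENGTH)

class Formatter(object):
    """Hypothetical stand-in for the parser's help formatter."""
    def format_option_strings(self, text):
        return text

formatter = Formatter()

def truncated(self, *args):
    # Call the original method, which was saved under a private name
    ret = formatter._format_option_strings(*args)
    if len(ret) > MAX_LEN:
        # "%%.%ds.." % 12 builds the format "%.12s..", which keeps the
        # first 12 characters and appends ".."
        ret = ("%%.%ds.." % MAX_LEN) % ret
    return ret

# Keep a reference to the original bound method, then bind the wrapper
formatter._format_option_strings = formatter.format_option_strings
formatter.format_option_strings = types.MethodType(truncated, formatter)

print(formatter.format_option_strings("--short"))
print(formatter.format_option_strings("--a-very-long-option=VALUE"))
```

Only this one instance is patched, so other Formatter objects would keep the original behavior.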

This part is also mostly about formatting. Below it come the checks on the individual arguments:

for i in xrange(len(argv)):
    longOptions = set(re.findall(r"\-\-([^= ]+?)=", parser.format_help()))
    longSwitches = set(re.findall(r"\-\-([^= ]+?)\s", parser.format_help()))
    if argv[i] == "-hh":
        argv[i] = "-h"
    elif i == 1 and re.search(r"\A(http|www\.|\w[\w.-]+\.\w{2,})", argv[i]) is not None:
        argv[i] = "--url=%s" % argv[i]
(rest omitted)
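The two rewrites visible in this excerpt can be tried in isolation. The sketch below reuses the regular expression from the code above; normalize_argv is a hypothetical helper, and it covers only the two branches shown, not the omitted ones.

```python
import re

# Same pattern as in the excerpt: "http", "www." or something host-like
URL_LIKE = re.compile(r"\A(http|www\.|\w[\w.-]+\.\w{2,})")

def normalize_argv(argv):
    """Return a copy of argv with the two rewrites from the excerpt applied."""
    result = list(argv)
    for i in range(len(result)):
        if result[i] == "-hh":
            result[i] = "-h"
        elif i == 1 and URL_LIKE.search(result[i]) is not None:
            # A bare URL as the first argument is treated as --url=<value>
            result[i] = "--url=%s" % result[i]
    return result

print(normalize_argv(["sqlmap.py", "http://localhost/?a=1"]))
print(normalize_argv(["sqlmap.py", "-hh"]))
```

This is what lets a bare URL in the first position work without an explicit -u.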

With no arguments, this part is of course skipped.

Finally we land here:

if not any((args.direct, args.url, args.logFile, args.bulkFile, args.googleDork, args.configFile, args.requestFile, args.updateAll, args.smokeTest, args.vulnTest, args.fuzzTest, args.wizard, args.dependencies, args.purge, args.listTampers, args.hashFile)):
    errMsg = "missing a mandatory option (-d, -u, -l, -m, -r, -g, -c, --list-tampers, --wizard, --update, --purge or --dependencies). "
    errMsg += "Use -h for basic and -hh for advanced help\n"
    parser.error(errMsg)
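The mandatory-option check relies on argparse's parser.error(), which prints the usage line plus the message to stderr and exits with status 2. A reduced sketch with only two of the options follows; parse and exit_code_for are hypothetical helpers, and the shortened error message is my own.

```python
import argparse

def parse(argv):
    parser = argparse.ArgumentParser(usage="sqlmap.py [options]")
    parser.add_argument("-u", "--url", dest="url")
    parser.add_argument("-d", dest="direct")
    args = parser.parse_args(argv)
    # any() is False only when every listed option is empty or unset
    if not any((args.direct, args.url)):
        # Prints usage and message to stderr, then raises SystemExit(2)
        parser.error("missing a mandatory option (-d or -u)")
    return args

def exit_code_for(argv):
    """Helper for this sketch: report the exit status instead of exiting."""
    try:
        parse(argv)
    except SystemExit as ex:
        return ex.code
    return 0

print(exit_code_for([]))
print(exit_code_for(["-u", "http://localhost/?a="]))
```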

Since no arguments were given, sqlmap prints this message and exits.

image-20200119165939072

URL

This time we add a URL to the analysis. Note that -u "http://localhost/?a=" has to be added under Parameters in the PyCharm run Configuration.

image-20200119170119176

The local index.php contains the following:

<?php
include 'config.php';
$table = $_GET['a'];
$conn = new mysqli($servername, $username, $password, $dbname);

$sql = "SELECT * FROM ns_member where uid=$table";
echo $sql;
$result = $conn->query($sql);

if ($result->num_rows > 0) {
    while ($row = $result->fetch_assoc()) {
        echo $row['member_name'];
    }
}
$conn->close();
?>
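The query in this file is built by pasting the a parameter straight into the SQL text. The short Python sketch below only reproduces that string building so the effect is visible. build_query is a hypothetical helper, the sample inputs are mine, and nothing is sent to a database.

```python
def build_query(uid):
    # Mirrors the PHP line: the raw parameter is pasted into the SQL text
    return "SELECT * FROM ns_member where uid=%s" % uid

print(build_query("1"))
print(build_query("1 OR 1=1"))
```

With the second input the WHERE clause becomes uid=1 OR 1=1, so the attacker-controlled text is interpreted as SQL rather than as a value.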

This is obviously injectable. Let's debug it, this time putting the breakpoint directly on:

parser = ArgumentParser(usage=usage)

A little further down, the -u option is registered in the Target group:

# Target options
target = parser.add_argument_group("Target", "At least one of these options has to be provided to define the target(s)")

target.add_argument("-u", "--url", dest="url",
    help="Target URL (e.g. \"http://www.site.com/vuln.php?id=1\")")

image-20200119170935138

Step all the way down until the function returns:

args = cmdLineParser()
cmdLineOptions.update(args.__dict__ if hasattr(args, "__dict__") else args)
initOptions(cmdLineOptions)

At this point args looks like this:

image-20200119171212658

Only url has a value. Continuing on, the conf variable makes its first appearance:

conf.showTime = True
dataToStdout("[!] legal disclaimer: %s\n\n" % LEGAL_DISCLAIMER, forceOutput=True)
dataToStdout("[*] starting @ %s\n\n" % time.strftime("%X /%Y-%m-%d/"), forceOutput=True)

For some reason, though, it is not displayed in the debugger. If anyone knows why, please enlighten me 🙏

image-20200119182348816

I worked around it by using Evaluate.

image-20200119182457956

Next comes a call to init(). Let's step into it:

def init():
    """
    Set attributes into both configuration and knowledge base singletons
    based upon command line and configuration file options.
    """

    _useWizardInterface()
    setVerbosity()
    _saveConfig()
    _setRequestFromFile()
    _cleanupOptions()
    _cleanupEnvironment()
    _purge()
    _checkDependencies()
    _createHomeDirectories()
    _createTemporaryDirectory()
    _basicOptionValidation()
    _setProxyList()
    _setTorProxySettings()
    _setDNSServer()
    _adjustLoggingFormatter()
    _setMultipleTargets()
    _listTamperingFunctions()
    _setTamperingFunctions()
    _setPreprocessFunctions()
    _setTrafficOutputFP()
    _setupHTTPCollector()
    _setHttpChunked()
    _checkWebSocket()
    ........

These read and apply various settings. Take a look at _createHomeDirectories:

for context in "output", "history":
    directory = paths["SQLMAP_%s_PATH" % context.upper()]
    try:
        if not os.path.isdir(directory):
            os.makedirs(directory)

        _ = os.path.join(directory, randomStr())
        open(_, "w+b").close()
        os.remove(_)
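The create-then-probe pattern in this excerpt can be sketched on its own. In the version below, random_name stands in for sqlmap's randomStr(), and ensure_writable_dir is a hypothetical helper. The excerpt does not show what sqlmap does when the probe fails, so returning False is purely an assumption of this sketch.

```python
import os
import random
import string
import tempfile

def random_name(length=4):
    # Stand-in for sqlmap's randomStr(); letters only
    return "".join(random.choice(string.ascii_letters) for _ in range(length))

def ensure_writable_dir(directory):
    """Create the directory if needed and verify a file can be written to it."""
    try:
        if not os.path.isdir(directory):
            os.makedirs(directory)
        probe = os.path.join(directory, random_name())
        open(probe, "w+b").close()   # create an empty probe file
        os.remove(probe)             # and clean it up again
        return True
    except (OSError, IOError):
        # Failure handling is not shown in the excerpt; this is an assumption
        return False

base = tempfile.mkdtemp()
print(ensure_writable_dir(os.path.join(base, "output")))
```

Writing and deleting a throwaway file is a simple way to catch permission problems early instead of at the first real write.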

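This sets up the output and history directories: each one is created if it is missing, and a temporary file is written and removed to confirm the directory is writable.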

image-20200119184120596

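There are also the _listTamperingFunctions and _setTamperingFunctions functions, which presumably deal with tamper scripts. We are not using any here, so just make a note of them for now.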

image-20200119184800602

image-20200119184900429

There is also _setHttpChunked, which appears to configure chunked transfer for requests.

image-20200119185028861

Next come the functions that set up the HTTP request:

_setHostname()
_setHTTPTimeout()
_setHTTPExtraHeaders()
_setHTTPCookies()
_setHTTPReferer()
_setHTTPHost()
_setHTTPUserAgent()
_setHTTPAuthentication()
_setHTTPHandlers()
_setDNSCache()
_setSocketPreConnect()
_setSafeVisit()
_doSearch()
_setBulkMultipleTargets()
_checkTor()
_setCrawler()
_findPageForms()
_setDBMS()
_setTechnique()

The _setHTTPUserAgent function adds sqlmap's own User-Agent.

image-20200119185539633

Below that are the functions that load the payloads and the like:

_setThreads()
_setOS()
_setWriteFile()
_setMetasploit()
_setDBMSAuthentication()
loadBoundaries()
loadPayloads()
_setPrefixSuffix()
update()
_loadQueries()

Take a look at loadPayloads:

for payloadFile in PAYLOAD_XML_FILES:
    payloadFilePath = os.path.join(paths.SQLMAP_XML_PAYLOADS_PATH, payloadFile)

where

PAYLOAD_XML_FILES = ("boolean_blind.xml", "error_based.xml", "inline_query.xml", "stacked_queries.xml", "time_blind.xml", "union_query.xml")

In other words, the payload files live in the data/xml/payloads/ folder under the sqlmap installation directory. Here is what they look like:

image-20200119190725348

Last is _loadQueries:

"""
Loads queries from 'xml/queries.xml' file.
"""

The contents of xml/queries.xml:

image-20200119191114242

These are all injection queries. That is the end of init(). Execution continues to start() on line 212:

def stackedmethod(f):
    """
    Method using pushValue/popValue functions (fallback function for stack realignment)

    >>> threadData = getCurrentThreadData()
    >>> original = len(threadData.valueStack)
    >>> __ = stackedmethod(lambda _: threadData.valueStack.append(_))
    >>> __(1)
    >>> len(threadData.valueStack) == original
    True
    """

    @functools.wraps(f)
    def _(*args, **kwargs):
        threadData = getCurrentThreadData()
        originalLevel = len(threadData.valueStack)

        try:
            result = f(*args, **kwargs)
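The excerpt stops right after the try, but the docstring states the intent: after the wrapped call, the value stack has the same length as before. Here is one way to get that behavior, sketched with a plain list in place of threadData.valueStack. The finally-based cleanup is my assumption and not necessarily how sqlmap finishes the function.

```python
import functools

value_stack = []  # stand-in for threadData.valueStack

def stackedmethod(f):
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        original_level = len(value_stack)
        try:
            return f(*args, **kwargs)
        finally:
            # Assumed cleanup: drop anything the call left on the stack
            if len(value_stack) > original_level:
                del value_stack[original_level:]
    return wrapper

@stackedmethod
def leaky(value):
    value_stack.append(value)  # pushes without a matching pop
    return value * 2

print(leaky(21), len(value_stack))
```

The point of such a decorator is that a function which exits early, or raises, cannot leave the shared stack misaligned for its caller.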

Stepping into the call behind result, and after a series of formatting steps, we reach line 428:

setupTargetEnv()

Step in:

def setupTargetEnv():
    _createTargetDirs()
    _setRequestParams()
    _setHashDB()
    _resumeHashDBValues()
    _setResultsFile()
    _setAuthCred()
    _setAuxOptions()

_createTargetDirs creates an output/<domain> folder:

conf.outputPath = os.path.join(getUnicode(paths.SQLMAP_OUTPUT_PATH), normalizeUnicode(getUnicode(conf.hostname)))

and writes a target.txt file into it:

with openFile(os.path.join(conf.outputPath, "target.txt"), "w+") as f:
    f.write(kb.originalUrls.get(conf.url) or conf.url or conf.hostname)
    f.write(" (%s)" % (HTTPMETHOD.POST if conf.data else HTTPMETHOD.GET))
    f.write(" # %s" % getUnicode(subprocess.list2cmdline(sys.argv), encoding=sys.stdin.encoding))
    if conf.data:
        f.write("\n\n%s" % getUnicode(conf.data))
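To see what ends up in target.txt, the write calls can be replayed as plain string building. target_line is a hypothetical helper, and the sketch assumes the HTTPMETHOD constants are simply the strings "GET" and "POST".

```python
import subprocess

def target_line(url, argv, data=None):
    """Build the text written to target.txt, following the excerpt's format."""
    method = "POST" if data else "GET"   # assumed values of HTTPMETHOD.*
    text = url
    text += " (%s)" % method
    # list2cmdline joins the arguments back into one command-line string
    text += " # %s" % subprocess.list2cmdline(argv)
    if data:
        text += "\n\n%s" % data
    return text

print(target_line("http://localhost/?a=", ["sqlmap.py", "-u", "http://localhost/?a="]))
```

So the file records the URL, the HTTP method, and the exact command line that was used for the scan.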

If data is being read out, other files are created as well:

_createDumpDir()
_createFilesDir()
_configureDumper()

Continuing to line 264 of target.py:

_ = re.sub(PROBLEMATIC_CUSTOM_INJECTION_PATTERNS, "", value or "") if place == PLACE.CUSTOM_HEADER else value or ""
if kb.customInjectionMark in _:
    if kb.processUserMarks is None:
        lut = {PLACE.URI: '-u', PLACE.CUSTOM_POST: '--data', PLACE.CUSTOM_HEADER: '--headers/--user-agent/--referer/--cookie'}
        message = "custom injection marker ('%s') found in option " % kb.customInjectionMark
        message += "'%s'. Do you want to process it? [Y/n/q] " % lut[place]
        choice = readInput(message, default='Y').upper()

The if here checks whether the parameter contains a *.

_setHashDB creates the session file, which records the state of hosts that have already been scanned. The file is session.sqlite inside the per-domain folder.

def _setHashDB():
    """
    Check and set the HashDB SQLite file for query resume functionality.
    """

    if not conf.hashDBFile:
        conf.hashDBFile = conf.sessionFile or os.path.join(conf.outputPath, SESSION_SQLITE_FILE)

    if os.path.exists(conf.hashDBFile):
        if conf.flushSession:
            try:
                os.remove(conf.hashDBFile)
                logger.info("flushing session file")
            except OSError as ex:
                errMsg = "unable to flush the session file ('%s')" % getSafeExString(ex)
                raise SqlmapFilePathException(errMsg)

    conf.hashDB = HashDB(conf.hashDBFile)
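The idea of a SQLite-backed session file that can be resumed or flushed can be sketched with the standard sqlite3 module. SessionStore and open_session are hypothetical, and the table layout is invented for this example; sqlmap's actual HashDB schema and API may well differ.

```python
import os
import sqlite3
import tempfile

class SessionStore(object):
    """Hypothetical key/value store; sqlmap's real HashDB may differ."""
    def __init__(self, path):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS storage (key TEXT PRIMARY KEY, value TEXT)")

    def write(self, key, value):
        self.conn.execute(
            "INSERT OR REPLACE INTO storage (key, value) VALUES (?, ?)", (key, value))
        self.conn.commit()

    def read(self, key):
        row = self.conn.execute(
            "SELECT value FROM storage WHERE key = ?", (key,)).fetchone()
        return row[0] if row else None

    def close(self):
        self.conn.close()

def open_session(path, flush=False):
    # Same decision as in _setHashDB: delete the old file only when flushing
    if os.path.exists(path) and flush:
        os.remove(path)
    return SessionStore(path)

session_path = os.path.join(tempfile.mkdtemp(), "session.sqlite")
store = open_session(session_path)
store.write("dbms", "MySQL")
store.close()

resumed = open_session(session_path)   # a later run picks up the stored value
print(resumed.read("dbms"))
resumed.close()
```

Opening the same path again returns what an earlier run stored, while flush=True starts from an empty file.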

The _resumeHashDBValues call that follows restores the session.

Next we arrive at line 430 of controller.py:

if not checkConnection(suppressOutput=conf.forms) or not checkString() or not checkRegexp():
    continue

Stepping into checkConnection: it first checks, through connect.py, whether the site can be reached, and stores what the request returns in kb. Line 1355:

if pageLength is None:
    try:
        page, headers, code = Connect.getPage(url=uri, get=get, post=post, method=method, cookie=cookie, ua=ua, referer=referer, host=host, silent=silent, auxHeaders=auxHeaders, response=response, raise404=raise404, ignoreTimeout=timeBasedCompare)
    except MemoryError:
        page, headers, code = None, None, None
        warnMsg = "site returned insanely large response"
        if kb.testMode:
            warnMsg += " in testing phase. This is a common "
            warnMsg += "behavior in custom WAF/IPS solutions"
        singleTimeWarnMessage(warnMsg)

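After that we return to checkWaf on line 442, which probes for a WAF, although from the code it does not look like much detection is going on.

Then on to checkStability() on line 452, which requests the page a second time:

secondPage, _, _ = Request.queryPage(content=True, noteResponseTime=False, raise404=False)

Both responses are recorded.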
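The reason for fetching the page twice is to find out whether the content is stable between identical requests. Here is a deliberately simplified sketch of that idea with stubbed page fetchers. check_stability and both fetch functions are hypothetical, and sqlmap's real comparison is likely more involved than a plain equality test.

```python
def check_stability(fetch):
    """Request the page twice and report whether the content is identical."""
    first_page = fetch()
    second_page = fetch()
    return first_page == second_page, first_page, second_page

def static_page():
    return "<html>SELECT * FROM ns_member where uid=</html>"

counter = {"hits": 0}

def dynamic_page():
    # Simulates content that changes on every request (e.g. a hit counter)
    counter["hits"] += 1
    return "<html>visit %d</html>" % counter["hits"]

print(check_stability(static_page)[0])
print(check_stability(dynamic_page)[0])
```

If a page changes on its own, later comparisons between a normal response and an injected one need to account for that noise.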

image-20200119195418746

Execution then continues to line 551 of controller.py:

check = checkDynParam(place, parameter, value)

Stepping into checkDynParam, again in checks.py, this time at line 1153:

try:
    payload = agent.payload(place, parameter, value, getUnicode(randInt))
    dynResult = Request.queryPage(payload, place, raise404=False)
except SqlmapConnectionException:
    pass
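The excerpt swaps the parameter value for a random integer and queries the page again. A stubbed sketch of that kind of check is shown below. is_dynamic and the two page functions are hypothetical, and the strict inequality test is an assumption standing in for whatever comparison sqlmap performs inside queryPage.

```python
import random

def is_dynamic(fetch, original_value):
    """Return True if swapping the value for a random integer changes the page."""
    baseline = fetch(original_value)
    rand_int = str(random.randint(1000, 9999))
    while rand_int == original_value:
        rand_int = str(random.randint(1000, 9999))
    return fetch(rand_int) != baseline

def echoing_page(value):
    # Like the test index.php, which echoes the SQL containing the parameter
    return "SELECT * FROM ns_member where uid=%s" % value

def ignoring_page(value):
    return "static content"

print(is_dynamic(echoing_page, "1"))
print(is_dynamic(ignoring_page, "1"))
```

A parameter that has no effect on the response is a poor injection candidate, so it makes sense to test this before spending requests on payloads.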

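Finally it checks whether this is a numeric SQL injection point. After that we enter the payload stage. The rest is too long to cover in full, so I will pick just one check to analyze.

controller.py, line 574:

check = heuristicCheckSqlInjection(place, parameter)

Step into heuristicCheckSqlInjection:

while randStr.count('\'') != 1 or randStr.count('\"') != 1:
    randStr = randomStr(length=10, alphabet=HEURISTIC_CHECK_ALPHABET)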

This generates a random 10-character string, retrying until it contains exactly one single quote and exactly one double quote. The characters are drawn from:

HEURISTIC_CHECK_ALPHABET = ('"', '\'', ')', '(', ',', '.')

Here is the randomStr function:

def randomStr(length=4, lowercase=False, alphabet=None, seed=None):
    if seed is not None:
        _ = getCurrentThreadData().random
        _.seed(seed)
        choice = _.choice
    else:
        choice = random.choice

    if alphabet:
        retVal = "".join(choice(alphabet) for _ in xrange(0, length))
    elif lowercase:
        retVal = "".join(choice(string.ascii_lowercase) for _ in xrange(0, length))
    else:
        retVal = "".join(choice(string.ascii_letters) for _ in xrange(0, length))

    return retVal
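Putting the loop and the alphabet together, the heuristic string can be reproduced in a few lines of Python 3. heuristic_string is a hypothetical wrapper written for this post, and it uses random.choice directly instead of sqlmap's randomStr.

```python
import random

HEURISTIC_CHECK_ALPHABET = ('"', '\'', ')', '(', ',', '.')

def heuristic_string(length=10):
    rand_str = ""
    # Retry until the string has exactly one single and one double quote
    while rand_str.count("'") != 1 or rand_str.count('"') != 1:
        rand_str = "".join(random.choice(HEURISTIC_CHECK_ALPHABET) for _ in range(length))
    return rand_str

sample = heuristic_string()
print(sample)
```

A string like this is full of characters that tend to break SQL syntax, which is the point of a heuristic check: an unbalanced quote or parenthesis is likely to provoke a database error on an injectable parameter.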

The payload is then substituted into the parameter and passed to Request:

payload = "%s%s%s" % (prefix, randStr, suffix)
payload = agent.payload(place, parameter, newValue=payload)
page, _, _ = Request.queryPage(payload, place, content=True, raise404=False)

image-20200119202701458

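Notice the odd-looking string in the payload:

__PAYLOAD_DELIMITER__

These act as placeholders and will be replaced in the requests that follow. The subsequent requests all look similar to this one.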
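A placeholder scheme like this can be sketched as follows: the payload is wrapped in a marker so it can be found again inside the full parameter string, and the markers are stripped (optionally after transforming the payload) just before the request is built. MARKER, wrap and finalize are hypothetical names for this sketch, not sqlmap's API.

```python
import re

MARKER = "__PAYLOAD_DELIMITER__"  # marker string used for this sketch

def wrap(payload):
    # Surround the payload so it can be located inside a larger string later
    return "%s%s%s" % (MARKER, payload, MARKER)

def finalize(request_value, transform=lambda p: p):
    """Apply transform to the marked payload and strip the markers."""
    pattern = re.compile("%s(.*?)%s" % (re.escape(MARKER), re.escape(MARKER)), re.S)
    return pattern.sub(lambda m: transform(m.group(1)), request_value)

marked = "a=" + wrap("1'\"(,).")
print(marked)
print(finalize(marked))
```

Keeping the payload marked until the last moment means later processing steps (encoding, tampering and so on) can touch only the payload and leave the rest of the request alone.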

image-20200119203036846