The code in this article is rough first-draft stuff, just to show the ideas 🤭
Created a file-reading function, LoadFile.go
```go
package loadfile

import (
	"bufio"
	"fmt"
	"os"
)

// LoadFile2List reads filename line by line and returns its non-empty lines.
func LoadFile2List(filename string) ([]string, error) {
	file, err := os.Open(filename) // os.Open, not os.open
	if err != nil {
		return nil, err
	}
	defer file.Close() // close when done, not before scanning

	scanner := bufio.NewScanner(file)
	var lines []string
	for scanner.Scan() {
		if line := scanner.Text(); line != "" {
			lines = append(lines, line) // append is a builtin; slices have no method
		}
	}
	if err := scanner.Err(); err != nil { // check once, after the loop
		return nil, fmt.Errorf("error: %v", err)
	}
	return lines, nil
}
```
It's mainly used for reading the wordlist file line by line.
After that I threw together an error-checking function, CheckError.go
```go
package checkerr

import (
	"fmt"
	"os"
)

// CheckErr prints err and exits. Note: Sprintf only returns a string and
// prints nothing, so use Fprintf to stderr instead.
func CheckErr(err error) {
	if err != nil {
		fmt.Fprintf(os.Stderr, "error: %v\n", err)
		os.Exit(1)
	}
}
```
It does exactly what the name suggests.
Next I wanted a function to customize the Transport so that the http.Client carries the request headers I want. That requires defining a struct that implements the http.RoundTripper interface (the RoundTrip method) and then passing our own struct as the client's Transport. I did it that way at first, but the logic got complicated and it didn't seem necessary (at least I haven't found a reason for it yet).
```go
package client

import (
	"net/http"
	"time"
)

// HeaderTransport adds default headers before delegating to a base RoundTripper.
type HeaderTransport struct {
	header http.Header
	base   http.RoundTripper
}

func (t *HeaderTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	for key, values := range t.header { // Go spells this "for k, v := range", not "in range"
		if req.Header.Get(key) == "" {
			for _, v := range values {
				req.Header.Add(key, v)
			}
		}
	}
	base := t.base
	if base == nil {
		base = http.DefaultTransport // assign; := inside the if would only shadow
	}
	return base.RoundTrip(req)
}

// NewClient wires the transport into a client. The body was empty in my
// draft; this is the obvious completion.
func NewClient(header http.Header, timeout time.Duration) *http.Client {
	return &http.Client{
		Timeout:   timeout,
		Transport: &HeaderTransport{header: header},
	}
}
```
Old code; hard to let go, but goodbye.
I changed the approach: now one function builds the request directly, and another function sends it.
```go
package cgetreq

import "net/http"

// CGETreq builds a GET request with a browser-like User-Agent.
// Returning early on error avoids setting headers on a nil request.
func CGETreq(finalpath string) (*http.Request, error) {
	req, err := http.NewRequest("GET", finalpath, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/142.0.0.0 Safari/537.36 Edg/142.0.0.0")
	return req, nil
}
```
```go
package sendmess

import (
	"fmt"
	"net/http"

	"basework/cgetreq"
)

// SendMessage sends one GET request and reports any status other than 404.
func SendMessage(client *http.Client, finalpath string) {
	req, err := cgetreq.CGETreq(finalpath)
	if err != nil {
		fmt.Printf("[error]: %v\n", err)
		return
	}
	rep, err := client.Do(req) // := here; rep was never declared
	if err != nil {
		fmt.Printf("[error]: %v\n", err)
		return
	}
	defer rep.Body.Close()
	if rep.StatusCode != 404 {
		fmt.Printf("[found] %-6d %s\n", rep.StatusCode, finalpath)
	}
}
```
Obviously the request headers are hard-coded right now; I'll look at adding them dynamically later.
Next comes the function that sends requests concurrently, using goroutines. It's probably not the fastest approach, but that's where my skills are at; I may swap it out later.
```go
package getscan

import (
	"fmt"
	"net/http" // was missing, though http.Client is used below
	"sync"
	"time" // was missing, though time.Second is used below

	"basework/sendmess"
	"basework/urlconstruct"
	// loadfile and checkerr were imported but unused; Go rejects that at compile time
)

// Getscan fans the candidate paths out over goroutines,
// with at most ratelimit requests in flight at once.
func Getscan(url string, filename string) {
	finalpath := urlconstruct.UrlConstruct(url, filename)
	client := &http.Client{Timeout: 3 * time.Second}

	var wg sync.WaitGroup
	ratelimit := 1000
	limitch := make(chan struct{}, ratelimit) // buffered channel as a concurrency limiter

	fmt.Println("-----START-----")
	for _, u := range finalpath { // renamed from url: the loop variable shadowed the parameter
		limitch <- struct{}{} // blocks once ratelimit goroutines are running
		wg.Add(1)
		go func(u string) {
			defer wg.Done()
			defer func() { <-limitch }()
			sendmess.SendMessage(client, u)
		}(u)
	}
	wg.Wait()
	fmt.Println("-----OVER-----")
}
```
Using struct{} as the channel's element type costs no extra memory, and the buffered channel caps the concurrency so the machine doesn't get flattened. Goroutines handle the concurrent sending, a sync.WaitGroup tracks them, and wg.Wait() makes sure every goroutine has finished.
And with that, a simple path-scanning tool is more or less complete.
(Next up is the tedious debugging phase, testing whether it actually runs 😭)