MCPcopy
hub / github.com/klauspost/compress / encodeBlockBetterDict

Function encodeBlockBetterDict

s2/encode_better.go:902–1513  ·  view source on GitHub ↗

encodeBlockBetterDict encodes a non-empty src to a guaranteed-large-enough dst. It assumes that the varint-encoded length of the decompressed bytes has already been written. It also assumes that: len(dst) >= MaxEncodedLen(len(src)) && minNonLiteralBlockSize <= len(src) && len(src) <= maxBlockSiz

(dst, src []byte, dict *Dict)

Source from the content-addressed store, hash-verified

900// len(dst) >= MaxEncodedLen(len(src)) &&
901// minNonLiteralBlockSize <= len(src) && len(src) <= maxBlockSize
902func encodeBlockBetterDict(dst, src []byte, dict *Dict) (d int) {
903 // sLimit is when to stop looking for offset/length copies. The inputMargin
904 // lets us use a fast path for emitLiteral in the main loop, while we are
905 // looking for copies.
906 // Initialize the hash tables.
907 const (
908 // Long hash matches.
909 lTableBits = betterLongTableBits
910 maxLTableSize = betterLongTableSize
911
912 // Short hash matches.
913 sTableBits = betterShortTableBits
914 maxSTableSize = betterShortTableSize
915
916 maxAhead = 8 // maximum bytes ahead without checking sLimit
917
918 debug = false
919 )
920
921 sLimit := min(len(src)-inputMargin, MaxDictSrcOffset-maxAhead)
922 if len(src) < minNonLiteralBlockSize {
923 return 0
924 }
925
926 dict.initBetter()
927
928 tbl := getBetterTables()
929 lTable := &tbl.lTable
930 sTable := &tbl.sTable
931 defer betterTablePool.Put(tbl)
932
933 // Bail if we can't compress to at least this.
934 dstLimit := len(src) - len(src)>>5 - 6
935
936 // nextEmit is where in src the next emitLiteral should start from.
937 nextEmit := 0
938
939 // The encoded form must start with a literal, as there are no previous
940 // bytes to copy, so we start looking for hash matches at s == 1.
941 s := 0
942 cv := load64(src, s)
943
944 // We initialize repeat to 0, so we never match on first attempt
945 repeat := len(dict.dict) - dict.repeat
946
947 // While in dict
948searchDict:
949 for {
950 candidateL := 0
951 nextS := 0
952 for {
953 // Next src position to check
954 nextS = s + (s-nextEmit)>>7 + 1
955 if nextS > sLimit {
956 break searchDict
957 }
958 hashL := hash7(cv, lTableBits)
959 hashS := hash4(cv, sTableBits)

Callers 3

TestDictBetterFunction · 0.85
TestDictBetter2Function · 0.85
EncodeBetterMethod · 0.85

Calls 9

getBetterTablesFunction · 0.85
initBetterMethod · 0.80
load64Function · 0.70
hash7Function · 0.70
hash4Function · 0.70
load32Function · 0.70
emitLiteralFunction · 0.70
emitRepeatFunction · 0.70
emitCopyFunction · 0.70

Tested by 2

TestDictBetterFunction · 0.68
TestDictBetter2Function · 0.68

Used in the wild real call sites across dependent graphs

searching dependent graphs…