
github.com/kpdecker/jsdiff @v9.0.0

169 symbols · 366 edges · 44 files · 8 documented (5%) · updated 33d ago · ★ 9,172 · 14 open issues
README

jsdiff

A JavaScript text differencing implementation. Try it out in the online demo.

Based on the algorithm proposed in "An O(ND) Difference Algorithm and its Variations" (Myers, 1986).

Installation

npm install diff --save

Getting started

Imports

In an environment where you can use imports, everything you need can be imported directly from diff. For example:

ESM:

import {diffChars, createPatch} from 'diff';

CommonJS:

const {diffChars, createPatch} = require('diff');

If you want to serve jsdiff to a web page without using a module system, you can use dist/diff.js or dist/diff.min.js. These create a global called Diff that contains the entire JsDiff API as its properties.

Usage

jsdiff's diff functions all take an old text and a new text and perform three steps:

  1. Split both texts into arrays of "tokens". What constitutes a token varies; in diffChars, each character is a token, while in diffLines, each line is a token.

  2. Find the smallest set of single-token insertions and deletions needed to transform the first array of tokens into the second.

This step depends upon having some notion of a token from the old array being "equal" to one from the new array, and this notion of equality affects the results. Usually two tokens are equal if === considers them equal, but some of the diff functions use an alternative notion of equality or have options to configure it. For instance, by default diffChars("Foo", "FOOD") will require two deletions (o, o) and three insertions (O, O, D), but diffChars("Foo", "FOOD", {ignoreCase: true}) will require just one insertion (of a D), since ignoreCase causes o and O to be considered equal.

  3. Return an array representing the transformation computed in the previous step as a series of change objects. The array is ordered from the start of the input to the end, and each change object represents inserting one or more tokens, deleting one or more tokens, or keeping one or more tokens.
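
The three steps can be sketched in a few lines of plain JavaScript. This is only an illustration, not jsdiff's implementation: it uses a simple dynamic-programming longest-common-subsequence table rather than the Myers algorithm, and the function names (tokenizeChars, simpleDiff) and the change object fields used here (value, added, removed, count) are chosen for this example.

```javascript
// Step 1: tokenize. Array.from splits a string into Unicode code points.
function tokenizeChars(text) {
  return Array.from(text);
}

function simpleDiff(oldText, newText, { ignoreCase = false } = {}) {
  const a = tokenizeChars(oldText);
  const b = tokenizeChars(newText);
  // The notion of token equality is configurable, as with ignoreCase.
  const equal = (x, y) =>
    ignoreCase ? x.toLowerCase() === y.toLowerCase() : x === y;

  // Step 2: lcs[i][j] holds the longest-common-subsequence length of
  // a[i..] and b[j..]. Tokens outside the LCS become deletions or insertions.
  const lcs = Array.from({ length: a.length + 1 }, () =>
    new Array(b.length + 1).fill(0));
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] = equal(a[i], b[j])
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Step 3: walk both arrays and merge runs of the same kind of
  // operation into change objects.
  const changes = [];
  const push = (value, added, removed) => {
    const last = changes[changes.length - 1];
    if (last && last.added === added && last.removed === removed) {
      last.value += value;
      last.count += 1;
    } else {
      changes.push({ value, added, removed, count: 1 });
    }
  };

  let i = 0;
  let j = 0;
  while (i < a.length || j < b.length) {
    if (i < a.length && j < b.length && equal(a[i], b[j])) {
      push(b[j], false, false);
      i++;
      j++;
    } else if (i < a.length &&
               (j === b.length || lcs[i + 1][j] >= lcs[i][j + 1])) {
      push(a[i], false, true);
      i++;
    } else {
      push(b[j], true, false);
      j++;
    }
  }
  return changes;
}

console.log(simpleDiff('Foo', 'FOOD'));
console.log(simpleDiff('Foo', 'FOOD', { ignoreCase: true }));
```

For the "Foo" / "FOOD" example from step 2, the sketch reports two removed tokens and three added tokens by default, and a single added token when ignoreCase is set.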

API

  • diffChars(oldStr, newStr[, options]) - diffs two blocks of text, treating each character as a token.

    ("Characters" here means Unicode code points - the elements you get when you loop over a string with a for ... of ... loop.)

    Returns a list of change objects.

    Options
      * ignoreCase: If true, the uppercase and lowercase forms of a character are considered equal. Defaults to false.

  • diffWords(oldStr, newStr[, options]) - diffs two blocks of text, treating each word and each punctuation mark as a token. Whitespace is ignored when computing the diff (but preserved as far as possible in the final change objects).

    Returns a list of change objects.

    Options
      * ignoreCase: Same as in diffChars. Defaults to false.
      * intlSegmenter: An optional Intl.Segmenter object (which must have a granularity of 'word') for diffWords to use to split the text into words.

    By default, diffWords does not use an Intl.Segmenter, just some regexes for splitting text into words. This will tend to give worse results than Intl.Segmenter would, but ensures the results are consistent across environments; Intl.Segmenter behaviour is only loosely specced and the implementations in browsers could in principle change dramatically in future. If you want to use diffWords with an Intl.Segmenter but ensure it behaves the same whatever environment you run it in, use an Intl.Segmenter polyfill instead of the JavaScript engine's native Intl.Segmenter implementation.

    Using an Intl.Segmenter should allow better word-level diffing of non-English text than the default behaviour. For instance, Intl.Segmenters can generally identify via built-in dictionaries which sequences of adjacent Chinese characters form words, allowing word-level diffing of Chinese. By specifying a language when instantiating the segmenter (e.g. new Intl.Segmenter('sv', {granularity: 'word'})) you can also support language-specific rules, like treating Swedish's colon separated contractions (like k:a for kyrka) as single words; by default this would be seen as two words separated by a colon.

  • diffWordsWithSpace(oldStr, newStr[, options]) - diffs two blocks of text, treating each word, punctuation mark, newline, or run of (non-newline) whitespace as a token.

  • diffLines(oldStr, newStr[, options]) - diffs two blocks of text, treating each line as a token.

    Options
      * ignoreWhitespace: true to ignore leading and trailing whitespace characters when checking if two lines are equal. Defaults to false.
      * ignoreNewlineAtEof: true to ignore a missing newline character at the end of the last line when comparing it to other lines. (By default, the line 'b\n' in text 'a\nb\nc' is not considered equal to the line 'b' in text 'a\nb'; this option makes them be considered equal.) Ignored if ignoreWhitespace or newlineIsToken are also true.
      * stripTrailingCr: true to remove all trailing CR (\r) characters before performing the diff. Defaults to false. This helps to get a useful diff when diffing UNIX text files against Windows text files.
      * newlineIsToken: true to treat the newline character at the end of each line as its own token. This allows for changes to the newline structure to occur independently of the line content and to be treated as such. In general this is the more human friendly form of diffLines; the default behavior with this option turned off is better suited for patches and other computer friendly output. Defaults to false.

    Note that while using ignoreWhitespace in combination with newlineIsToken is not an error, results may not be as expected. With ignoreWhitespace: true and newlineIsToken: false, changing a completely empty line to contain some spaces is treated as a non-change, but with ignoreWhitespace: true and newlineIsToken: true, it is treated as an insertion. This is because the content of a completely blank line is not a token at all in newlineIsToken mode.

    Returns a list of change objects.
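
To make the effect of newlineIsToken on tokenization concrete, here is a minimal sketch of a line tokenizer. It is an approximation written for this example (the name tokenizeLines is hypothetical), not the library's own code, and it ignores the other diffLines options.

```javascript
// Split text into line tokens. By default each token keeps its trailing
// newline; with newlineIsToken the newline becomes a token of its own.
function tokenizeLines(text, { newlineIsToken = false } = {}) {
  // The capturing group keeps the newline separators in the result,
  // so they always sit at the odd indexes.
  const parts = text.split(/(\r\n|\n)/);
  const tokens = [];
  parts.forEach((part, index) => {
    const isNewline = index % 2 === 1;
    if (isNewline && !newlineIsToken) {
      tokens[tokens.length - 1] += part; // glue newline onto its line
    } else {
      tokens.push(part);
    }
  });
  // Empty strings (for example the content of a blank line) are not tokens.
  return tokens.filter((token) => token !== '');
}

console.log(tokenizeLines('a\nb\nc'));
console.log(tokenizeLines('a\nb'));
console.log(tokenizeLines('\n\n', { newlineIsToken: true }));
console.log(tokenizeLines('\n  \n', { newlineIsToken: true }));
```

In this sketch the last line of 'a\nb' is the token 'b' while the second line of 'a\nb\nc' is 'b\n', which is the mismatch ignoreNewlineAtEof addresses. The last two calls show why adding spaces to a blank line looks like an insertion in newlineIsToken mode: the blank line contributes no content token until it contains something.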

  • diffSentences(oldStr, newStr[, options]) - diffs two blocks of text, treating each sentence, and the whitespace between each pair of sentences, as a token. The characters ., !, and ?, when followed by whitespace, are treated as marking the end of a sentence; nothing else besides the end of the string is considered to mark a sentence end.

    (For more sophisticated detection of sentence breaks, including support for non-English punctuation, consider instead tokenizing with an Intl.Segmenter with granularity: 'sentence' and passing the result to diffArrays.)

    Returns a list of change objects.
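
The Intl.Segmenter approach mentioned for diffSentences could look roughly like the following. The helper name tokenizeSentences and the default locale are assumptions made for this example; only the tokenization is shown, using the JavaScript engine's built-in Intl.Segmenter.

```javascript
// Split text into sentence tokens with the built-in Intl.Segmenter.
// The default locale is an assumption; pick the one that matches your text.
function tokenizeSentences(text, locale = 'en') {
  const segmenter = new Intl.Segmenter(locale, { granularity: 'sentence' });
  // Each segment object carries the matched text in its `segment` property.
  return Array.from(segmenter.segment(text), (part) => part.segment);
}

const oldTokens = tokenizeSentences('Hi there. How are you?');
const newTokens = tokenizeSentences('Hi there. How are you today?');

// These two arrays are what would be passed on to diffArrays.
console.log(oldTokens);
console.log(newTokens);
```

Because the segments cover the whole input, joining the tokens reproduces the original string, so nothing is lost by diffing the arrays instead of the text.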
  • diffCss(oldStr, newStr[, options]) - diffs two blocks of text, comparing CSS tokens.

    Returns a list of change objects.

  • diffJson(oldObj, newObj[, options]) - diffs two JSON-serializable objects by first serializing them to prettily-formatted JSON and then treating each line of the JSON as a token. Object properties are ordered alphabetically in the serialized JSON, so the order of properties in the objects being compared doesn't affect the result.

    Returns a list of change objects.

    Options
      * stringifyReplacer: A custom replacer function. Operates similarly to the replacer parameter to JSON.stringify(), but must be a function.
      * undefinedReplacement: A value to replace undefined with. Ignored if a stringifyReplacer is provided.
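
The key-ordering behaviour of diffJson can be pictured with a small sketch. This is not the library's serializer: the helper names canonicalize and toComparableJson are made up for this example, and the sketch does not handle circular references, the replacer, or undefined values.

```javascript
// Recursively rebuild a value so that object keys are inserted in sorted order.
function canonicalize(value) {
  if (Array.isArray(value)) {
    return value.map(canonicalize); // array order is meaningful, keep it
  }
  if (value !== null && typeof value === 'object') {
    const sorted = {};
    for (const key of Object.keys(value).sort()) {
      sorted[key] = canonicalize(value[key]);
    }
    return sorted;
  }
  return value;
}

// Pretty-print with one property per line, ready for a line-based diff.
function toComparableJson(value) {
  return JSON.stringify(canonicalize(value), null, 2);
}

const first = toComparableJson({ b: 1, a: { d: 2, c: 3 } });
const second = toComparableJson({ a: { c: 3, d: 2 }, b: 1 });
console.log(first === second); // the two objects serialize identically
```

Once both objects are serialized this way, objects that differ only in property order produce identical text, and any remaining differences show up line by line.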

  • diffArrays(oldArr, newArr[, options]) - diffs two arrays of tokens, comparing each item for strict equality (===).

    Options
      * comparator: function(left, right) for custom equality checks

    Returns a list of change objects.

  • createTwoFilesPatch(oldFileName, newFileName, oldStr, newStr[, oldHeader[, newHeader[, options]]]) - creates a unified diff patch by first computing a diff with diffLines and then serializing it to unified diff format.

    Parameters:
      * oldFileName: String to be output in the filename section of the patch for the removals
      * newFileName: String to be output in the filename section of the patch for the additions
      * oldStr: Original string value
      * newStr: New string value
      * oldHeader: Optional additional information to include in the old file header. Default: undefined.
      * newHeader: Optional additional information to include in the new file header. Default: undefined.
      * options: An object with options.
        - context: describes how many lines of context should be included. You can set this to Number.MAX_SAFE_INTEGER or Infinity to include the entire file content in one hunk.
        - ignoreWhitespace: Same as in diffLines. Defaults to false.
        - stripTrailingCr: Same as in diffLines. Defaults to false.
        - headerOptions: Configures the format of patch headers in the returned patch. (Note these are distinct from hunk headers, which are a mandatory part of the unified diff format and not configurable.) Has three subfields (all default to true):
          - includeIndex: whether to include a line like Index: filename.txt at the start of the patch header. (Even if this is true, this line will be omitted if oldFileName and newFileName are not identical.)
          - includeUnderline: whether to include ===================================================================.
          - includeFileHeaders: whether to include two lines indicating the old and new filename, formatted like --- old.txt and +++ new.txt.

    Note further that jsdiff exports three top-level constants that can be used as `headerOptions` values, named `INCLUDE_HEADERS` (the default), `FILE_HEADERS_ONLY`, and `OMIT_HEADERS`.
    
    (Note that in the case where `includeIndex` and `includeFileHeaders` are both false, the `oldFileName` and `newFileName` parameters are ignored entirely.)
    
    The GNU `patch` util will accept patches produced with any configuration of these header options (and refers to patch headers as "leading garbage", which in typical usage it makes no attempt to parse or use in any way). However, other tools for working with unified diff format patches may be less liberal (and are not unambiguously wrong to be so, since the format has no real standard). Tinkering with the `headerOptions` setting thus provides a way to help make patches produced by jsdiff compatible with other tools.
    
  • createPatch(fileName, oldStr, newStr[, oldHeader[, newHeader[, options]]]) - creates a unified diff patch.

    Just like createTwoFilesPatch, but with oldFileName being equal to newFileName.

  • formatPatch(patch[, headerOptions]) - creates a unified diff patch.

    patch may be either a single structured patch object (as returned by structuredPatch) or an array of them (as returned by parsePatch). The optional headerOptions argument behaves the same as the headerOptions option of createTwoFilesPatch, except that it is ignored for Git patches (i.e. patches where isGit is true).

    When a patch has isGit: true, formatPatch output is changed to more closely match Git's output: it emits a diff --git header, emits Git extended headers as appropriate based on properties like isRename, isCreate, newMode, etc, and always emits ---/+++ file headers when hunks are present but omits them when there are no hunks (e.g. renames without content changes). The headerOptions parameter has no effect on Git patches since the header format is fully determined by the Git extended header properties.

  • structuredPatch(oldFileName, newFileName, oldStr, newStr[, oldHeader[, newHeader[, options]]]) - returns an object with an array of hunk objects.

    This method is similar to createTwoFilesPatch, but returns a data structure suitable for further processing. Parameters are the same as createTwoFilesPatch. The data structure returned may look like this:

    {
      oldFileName: 'oldfile', newFileName: 'newfile',
      oldHeader: 'header1', newHeader: 'header2',
      hunks: [{
        oldStart: 1, oldLines: 3, newStart: 1, newLines: 3,
        lines: [' line2', ' line3', '-line4', '+line5', '\\ No newline at end of file'],
      }]
    }
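
As an example of the further processing this structure allows, the sketch below checks that a hunk's line counts agree with its lines and renders a unified diff hunk header from it. The helpers countHunkLines and formatHunk are hypothetical and are not part of jsdiff; the rendering is simplified and always writes both the start and the count.

```javascript
// Hunk copied from the structuredPatch example above.
const hunk = {
  oldStart: 1, oldLines: 3, newStart: 1, newLines: 3,
  lines: [' line2', ' line3', '-line4', '+line5', '\\ No newline at end of file'],
};

// Count how many lines the hunk covers on the old and the new side.
function countHunkLines(h) {
  let oldLines = 0;
  let newLines = 0;
  for (const line of h.lines) {
    if (line[0] === ' ') { oldLines++; newLines++; } // context: both sides
    else if (line[0] === '-') oldLines++;             // deletion: old side only
    else if (line[0] === '+') newLines++;             // insertion: new side only
    // Lines starting with '\' are markers and count for neither side.
  }
  return { oldLines, newLines };
}

// Render the hunk header followed by the hunk's lines.
function formatHunk(h) {
  const header = `@@ -${h.oldStart},${h.oldLines} +${h.newStart},${h.newLines} @@`;
  return [header, ...h.lines].join('\n');
}

console.log(countHunkLines(hunk));
console.log(formatHunk(hunk));
```

For the example hunk, two context lines plus one deletion give three old lines, and two context lines plus one insertion give three new lines, matching oldLines and newLines.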

  • applyPatch(source, patch[, options]) - attempts to apply a unified diff patch.

    Hunks are applied first to last. applyPatch first tries to apply the first hunk at the line number specified in the hunk header, and with all context lines matching exactly. If that fails, it tries scanning backwards and forwards, one line at a time, to find a place to apply the hunk where the context lines match exactly. If that still fails, and fuzzFactor is greater than zero, it increments the maximum number of mismatches (missing, extra, or changed context lines) that there can be between the hunk context and a region where we are trying to apply the patch such that the hunk will still be considered to match. Regardless of fuzzFactor, lines to be deleted in the hunk must be present for a hunk to match, and the context lines immediately before and after an insertion must match exactly.

    Once a hunk is successfully fitted, the process begins again with the next hunk. Regardless of fuzzFactor, later hunks must be applied later in the file than earlier hunks.
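
The exact-match part of this search can be sketched as follows. This is a simplified illustration rather than applyPatch's real logic: it ignores fuzzFactor entirely, the helper names hunkFitsAt and findHunkPosition are invented for this example, and trying the backward candidate before the forward one at each distance is an assumption.

```javascript
// True if every context and deletion line of the hunk matches the file
// exactly, starting at the 0-based index `start`.
function hunkFitsAt(fileLines, hunk, start) {
  const oldSide = hunk.lines
    .filter((line) => line[0] === ' ' || line[0] === '-')
    .map((line) => line.slice(1));
  if (start < 0 || start + oldSide.length > fileLines.length) return false;
  return oldSide.every((line, k) => fileLines[start + k] === line);
}

// Try the position named in the hunk header first, then scan outwards one
// line at a time. `minStart` keeps later hunks after earlier ones.
function findHunkPosition(fileLines, hunk, minStart = 0) {
  const expected = hunk.oldStart - 1; // hunk headers are 1-based
  for (let distance = 0; distance <= fileLines.length; distance++) {
    const candidates = distance === 0
      ? [expected]
      : [expected - distance, expected + distance];
    for (const candidate of candidates) {
      if (candidate >= minStart && hunkFitsAt(fileLines, hunk, candidate)) {
        return candidate;
      }
    }
  }
  return -1; // no exact fit anywhere
}

const sampleHunk = {
  oldStart: 1, oldLines: 3, newStart: 1, newLines: 3,
  lines: [' line2', ' line3', '-line4', '+line5'],
};
// The file gained a line at the top, so the hunk fits one line later.
console.log(findHunkPosition(['x', 'line2', 'line3', 'line4'], sampleHunk));
```

Passing the end position of the previously fitted hunk as minStart models the rule that later hunks must land later in the file than earlier ones.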

    If a hunk cannot be successfully fitted *anyw

Extension points exported contracts — how you extend this code

DraftChangeObject (Interface)
* Like a ChangeObject, but with no value and an extra `previousComponent` property. * A linked list of these (linked vi
src/diff/base.ts
DiffObj (Interface)
(no doc)
test-d/originalDefinitelyTypedTests.test-d.ts
ChangeObject (Interface)
(no doc)
src/types.ts
HeaderOptions (Interface)
(no doc)
src/patch/create.ts
Path (Interface)
(no doc)
src/diff/base.ts
CommonDiffOptions (Interface)
(no doc)
src/types.ts
_StructuredPatchOptionsAbortable (Interface)
(no doc)
src/patch/create.ts
TimeoutOption (Interface)
(no doc)
src/types.ts

Core symbols most depended-on inside this repo

parsePatch
called by 78
src/patch/parse.ts
convertChangesToXML
called by 75
src/convert/xml.ts
applyPatch
called by 72
src/patch/apply.ts
diffWords
called by 45
src/diff/word.ts
formatPatch
called by 43
src/patch/create.ts
createPatch
called by 29
src/patch/create.ts
diffLines
called by 23
src/diff/line.ts
diffChars
called by 20
src/diff/character.ts

Shape

Function 81
Interface 38
Method 30
Class 20

Languages

TypeScript 100%

Modules by API surface

src/types.ts · 22 symbols
src/patch/create.ts · 19 symbols
src/diff/base.ts · 18 symbols
src/util/string.ts · 14 symbols
test-d/originalDefinitelyTypedTests.test-d.ts · 12 symbols
src/diff/word.ts · 12 symbols
src/patch/parse.ts · 11 symbols
src/patch/apply.ts · 8 symbols
src/diff/json.ts · 7 symbols
src/diff/line.ts · 6 symbols
src/diff/array.ts · 6 symbols
src/diff/sentence.ts · 5 symbols

Dependencies from manifests, versioned

@arethetypeswrong/cli 0.18.2 · 1×
@babel/core 7.29.0 · 1×
@babel/preset-env 7.29.2 · 1×
@babel/register 7.28.6 · 1×
@colors/colors 1.6.0 · 1×
@eslint/js 10.0.1 · 1×
babel-loader 10.1.1 · 1×
babel-plugin-istanbul 8.0.0 · 1×
chai 6.2.2 · 1×
cross-env 10.1.0 · 1×
eslint 10.2.0 · 1×
globals 17.5.0 · 1×

For agents

$ claude mcp add jsdiff \
  -- python -m otcore.mcp_server <graph>
