The formula to calculate the Inverse Document Frequency (IDF) is:
\[ IDF = \log\left(\frac{N}{n}\right) \]
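Where N is the total number of documents in the corpus and n is the number of documents that contain the term. The logarithm here is base 10; natural and base-2 logarithms are also common and only rescale the result by a constant factor.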
Inverse Document Frequency (IDF) is a measure used in information retrieval and text mining to evaluate how informative a word is across a collection or corpus of documents. The score falls as the number of documents containing the word rises: a word that appears in nearly every document gets an IDF close to 0, while a word confined to a handful of documents gets a high IDF.
IDF is a key component of the TF-IDF (Term Frequency-Inverse Document Frequency) scoring scheme, which is commonly used in search engines and information retrieval systems to rank documents by relevance to a given search query.
Let's assume the following values: a corpus of N = 1000 documents, of which n = 10 contain the term.
Using the formula to calculate the Inverse Document Frequency:
\[ IDF = \log\left(\frac{1000}{10}\right) = \log(100) = 2 \]
With a base-10 logarithm, the Inverse Document Frequency (IDF) is exactly 2.
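The same calculation can be sketched in Python. This is a minimal illustration rather than a reference implementation: the function name `inverse_document_frequency` and its parameter names are chosen here for clarity, and the base-10 logarithm is an assumption that matches the worked example above. Library implementations often use a natural logarithm or add smoothing terms, so their numbers will differ.

```python
import math


def inverse_document_frequency(total_docs: int, docs_with_term: int) -> float:
    """Return log10(N / n) for a corpus of N documents, n of which contain the term.

    Hypothetical helper for illustration; base 10 is assumed to match the
    worked example.
    """
    if total_docs <= 0:
        raise ValueError("total_docs must be positive")
    if not 0 < docs_with_term <= total_docs:
        # n = 0 would divide by zero; n > N is not a valid document count.
        raise ValueError("docs_with_term must be between 1 and total_docs")
    return math.log10(total_docs / docs_with_term)


# The worked example: N = 1000, n = 10.
idf_rare = inverse_document_frequency(1000, 10)

# For comparison: a term found in half the corpus, and one found in every document.
idf_common = inverse_document_frequency(1000, 500)
idf_everywhere = inverse_document_frequency(1000, 1000)

print(idf_rare, idf_common, idf_everywhere)
```

The three values show the trend the formula encodes: the rarer the term, the larger its IDF, and a term present in every document scores 0.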