Discover Top Posts Tagged with #ngrams

Playing around with the corpus of all transcripts from all episodes of CR. After removing the most common English words, these are the top 4-word groupings. Mentally constructing my prototypical CR episode now.
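For anyone curious how a count like this can be produced, here is a minimal sketch using only the Python standard library. It is not the notebook behind the post: the tiny `STOPWORDS` set, the `top_ngrams` helper, and the sample text are all made up for illustration, and a real run would probably use a fuller stopword list such as NLTK's.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "is", "it", "that", "i", "you", "on", "in"}

def top_ngrams(text, n=4, k=10, stopwords=STOPWORDS):
    """Return the k most common word n-grams after dropping stopwords."""
    # Lowercase, keep runs of letters/apostrophes, then filter out stopwords.
    tokens = [t for t in re.findall(r"[a-z']+", text.lower()) if t not in stopwords]
    # Zip n staggered copies of the token list to get consecutive n-word groups.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(grams).most_common(k)

# Made-up sample text, not taken from any transcript.
sample = (
    "Make a constitution saving throw. "
    "I need you to make a constitution saving throw. "
    "Go ahead and roll damage."
)

for gram, count in top_ngrams(sample, n=3, k=3):
    print(" ".join(gram), count)
```

Note that this simple version lets groupings run across sentence boundaries, and removing stopwords before grouping means the "4-word" phrases may skip over words that were in the original text.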

#critical role #python #jupyter #ngrams #critical role data #critical role wiki #nltk #I knew it was always a constitution saving throw That is 941 times of rolling damage #critical role things

Tool Time: Using Google Ngrams

I was sure I’d written about this before, but no. So.

Google Ngrams is a great, easy-to-use tool for finding the frequency of a word or phrase in printed material. Let’s say you want to know how popular the phrase “try and” is, compared to “try to.”

You go here and then you enter the phrases you want to search for, separated by a comma. (You can fiddle with the start/end dates, the corpus to be…


#Garner #hills to die on #Modern English Usage #Ngrams #try and #try to #word choice

Cada + año

To translate the expression "each year" we usually fall back on the literal "cada año". Spanish, however, had (and still has?) a more compact single word:

cadañero

#Each year #Traducción #Palabras olvidadas #Ngrams

Did you know that usage of the word "the" in English has declined by one third in the last two hundred years?

In 1820, "the" made up 6.17% of all written words in English; by 2019, that share had declined to 4.17%!

Considering that "the" is the most common word in English and that it is so grammatically essential, this is wild!
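A quick check of the "one third" figure, using only the two percentages quoted in the post (the variable names are just for illustration):

```python
share_1820 = 6.17  # percent of all written words, as quoted in the post
share_2019 = 4.17  # percent of all written words, as quoted in the post

absolute_drop = share_1820 - share_2019      # in percentage points
relative_drop = absolute_drop / share_1820   # as a fraction of the 1820 share

print(f"Absolute drop: {absolute_drop:.2f} percentage points")
print(f"Relative drop: {relative_drop:.1%}")
```

The drop is 2 percentage points in absolute terms, which works out to roughly a third of the 1820 share.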

#english #lingustics #the #words #ngrams

python matching with ngrams

# https://pythonprogrammingsnippets.com

def get_ngrams(text, n):
    # Split text into overlapping character n-grams.
    # Returns an empty list when the text is shorter than n.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def compare_strings_ngram_pct(string1, string2, n):
    # Percentage of string1's unique n-grams that also appear in string2.
    string1_ngrams = set(get_ngrams(string1, n))
    string2_ngrams = set(get_ngrams(string2, n))

    # Guard against division by zero when string1 is shorter than n.
    if not string1_ngrams:
        return 0.0

    # Count unique n-grams on both sides so repeated n-grams
    # do not skew the percentage.
    matching_ngrams = string1_ngrams & string2_ngrams
    return (len(matching_ngrams) / len(string1_ngrams)) * 100


def compare_strings_ngram_max_size(string1, string2):
    # Size of the largest n-gram the two strings share
    # (i.e. the length of their longest common substring).
    n = min(len(string1), len(string2))
    # Try the largest size first and stop at the first match.
    for i in range(n, 0, -1):
        string1_ngrams = set(get_ngrams(string1, i))
        string2_ngrams = set(get_ngrams(string2, i))
        if string1_ngrams & string2_ngrams:
            return i
    # No n-grams in common at any size.
    return 0


string1 = "hello world"
string2 = "hello there"
n = 2  # n-gram size

# How much of string1 (by unique n-grams) also appears in string2
percentage_match = compare_strings_ngram_pct(string1, string2, n)
print(f"The percentage match is: {percentage_match}%")

# Largest n-gram size the two strings have in common
max_match_size = compare_strings_ngram_max_size(string1, string2)
print(f"The maximum matching n-gram size is: {max_match_size}")

Guest post by Dr. Kalev Leetaru. Radio remains one of the most-consumed forms of traditional media today, with 89% of Americans listening to…

#radio #broadcasting #digital preservation #GDELT #ngrams #data sets #internet archive #libraries #media #publishing #public health #archives

A chart from Google’s ngrams showing the relative frequencies of “he loves her”, “she loves him”, “she loves her” and “he loves him”, from 1800 to 2000.

I know it’s the wrong language but the Les Mis fan in me loves that “he loves him” seems to peak in the 1820s - around about the time any real-life les Amis would have been doing their thing.

But otherwise, curious and kinda sad that “he loves him” is the only one of the four that appears less in the 20th century than in the 19th. It’s like seeing a macho “no homo” culture appear over time - boo to that.

#ngrams #linguistics #language #les amis #les miserables #les amis de l'abc #love is love

Do you love the word painterly but wish there were more words like it so you could demonstrate your complete lack of taste in words more frequently? Well here you go!

#painterly #ngrams

#ngrams
