Sentencize.SUPPORTED_LANG
— Constant
SUPPORTED_LANG
Supported languages.
Contains:
- "ca"
- "cs"
- "da"
- "de"
- "el"
- "en"
- "es"
- "fi"
- "fr"
- "hu"
- "is"
- "it"
- "lt"
- "lv"
- "nl"
- "no"
- "pl"
- "pt"
- "ro"
- "ru"
- "sk"
- "sl"
- "sv"
- "tr"
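As a small sketch of how this constant might be used, the snippet below checks a language code before passing it on. It assumes SUPPORTED_LANG is a collection of strings that supports `in`, which the listing above suggests but does not confirm. `is_supported` is a hypothetical helper, not part of Sentencize.

```julia
using Sentencize

# Hypothetical helper (not part of Sentencize): true if `lang` is one of
# the codes listed in SUPPORTED_LANG. Assumes the constant is a collection
# of strings for which `in` tests membership.
is_supported(lang::AbstractString) = lang in Sentencize.SUPPORTED_LANG

is_supported("en")  # "en" appears in the list above
is_supported("xx")  # "xx" does not appear in the list above
```

The constant is referenced by its qualified name, so the sketch does not depend on whether it is exported.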
Sentencize.Prefixes
— Type
Prefixes(prefixes::Dict{T,PrefixType}=Dict{String,PrefixType}(); prefix_file::Union{String,Nothing}=nothing, lang::Union{String,Nothing}="en") where {T<:AbstractString}
Constructs Prefixes.
Arguments
- prefixes::Dict{<:AbstractString,PrefixType}=Dict{String,PrefixType}(): Optional. A dictionary of non-breaking prefixes.
- prefix_file::Union{String,Nothing}=nothing: Optional. A path to a file containing non-breaking prefixes to add to the provided prefixes.
- lang::Union{String,Nothing}="en": Optional. The language of the non-breaking prefixes (see ?SUPPORTED_LANG for available languages) to be added to prefixes.
Sentencize.split_sentence
— Method
split_sentence(text::AbstractString; prefixes::Dict{<:AbstractString,PrefixType}=Dict{String,PrefixType}(), prefix_file::Union{String,Nothing}=nothing, lang::Union{String,Nothing}="en")
Splits a text into sentences.
Arguments
- text::AbstractString: The text to split into sentences.
- prefixes::Dict{<:AbstractString,PrefixType}: Optional. A dictionary of non-breaking prefixes.
- prefix_file::Union{String,Nothing}: Optional. A path to a file containing non-breaking prefixes to add to the provided prefixes.
- lang::Union{String,Nothing}: Optional. The language of the non-breaking prefixes (see ?SUPPORTED_LANG for available languages) to be added to prefixes. Default is "en" (English).
Examples
split_sentence("This is a paragraph. It contains several sentences. \"But why,\" you ask?")
# Output: ["This is a paragraph.", "It contains several sentences.", "\"But why,\" you ask?"]
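The lang keyword documented above selects which language's non-breaking prefixes are used. A minimal sketch for German text follows. The sample sentence is made up for illustration, and no output is shown because the exact split depends on the bundled German prefix list.

```julia
using Sentencize

# Illustrative input (not from the package documentation).
text = "Das ist z. B. ein Satz. Hier ist noch einer."

# "de" is one of the codes listed under SUPPORTED_LANG.
# The qualified name works whether or not split_sentence is exported.
sentences = Sentencize.split_sentence(text; lang="de")

# Print one sentence per line to inspect where the text was split.
for s in sentences
    println(s)
end
```

Using a code that is not listed under SUPPORTED_LANG is not covered by this sketch, since the documentation above does not say how that case is handled.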