Sentencize.SUPPORTED_LANG (Constant)
SUPPORTED_LANG

Language codes for which non-breaking prefixes are available.

Contains:

  • "ca"
  • "cs"
  • "da"
  • "de"
  • "el"
  • "en"
  • "es"
  • "fi"
  • "fr"
  • "hu"
  • "is"
  • "it"
  • "lt"
  • "lv"
  • "nl"
  • "no"
  • "pl"
  • "pt"
  • "ro"
  • "ru"
  • "sk"
  • "sl"
  • "sv"
  • "tr"
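
A minimal sketch of how the constant might be used to validate a language code before passing it as `lang`. It assumes `SUPPORTED_LANG` is an iterable collection of strings, as the list above suggests. `pick_lang` and its `fallback` keyword are hypothetical helpers for illustration, not part of Sentencize.

```julia
using Sentencize

# Hypothetical helper (not part of Sentencize): return `code` if it appears in
# SUPPORTED_LANG, otherwise fall back to another language code.
# Assumes SUPPORTED_LANG is an iterable collection of strings.
function pick_lang(code::AbstractString; fallback::AbstractString="en")
    return code in Sentencize.SUPPORTED_LANG ? String(code) : String(fallback)
end

pick_lang("de")  # "de" is in the list above
pick_lang("xx")  # "xx" is not in the list, so the fallback is returned
```

The qualified name `Sentencize.SUPPORTED_LANG` is used so the sketch does not depend on whether the constant is exported.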
Sentencize.Prefixes (Type)

Prefixes(prefixes::Dict{T,PrefixType}=Dict{String,PrefixType}(); prefix_file::Union{String,Nothing}=nothing, lang::Union{String,Nothing}="en") where {T<:AbstractString}

Constructs a Prefixes object from the given non-breaking prefixes.

Arguments

  • prefixes::Dict{<:AbstractString,PrefixType}=Dict{String,PrefixType}(): Optional. A dictionary of non-breaking prefixes.
  • prefix_file::Union{String,Nothing}=nothing: Optional. A path to a file containing non-breaking prefixes to add to the provided prefixes.
  • lang::Union{String,Nothing}="en": Optional. The language of the non-breaking prefixes to be added to prefixes (see ?SUPPORTED_LANG for available languages).
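
As an illustration of the keyword arguments, the following sketch constructs `Prefixes` in two ways, based only on the signature shown above. The variable names are arbitrary, and nothing is assumed about the fields of the resulting object.

```julia
using Sentencize

# Default call: per the signature, this starts from an empty dictionary
# and uses lang="en".
english = Sentencize.Prefixes()

# Select another language via the `lang` keyword; "de" is listed in SUPPORTED_LANG.
german = Sentencize.Prefixes(lang="de")
```

A `prefix_file` path can be passed in the same way when additional prefixes are stored in a file.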
Sentencize.split_sentence (Method)
split_sentence(text::AbstractString; prefixes::Dict{<:AbstractString,PrefixType}=Dict{String,PrefixType}(), prefix_file::Union{String,Nothing}=nothing, lang::Union{String,Nothing}="en")

Splits a text into sentences.

Arguments

  • text::AbstractString: The text to split into sentences.
  • prefixes::Dict{<:AbstractString,PrefixType}: Optional. A dictionary of non-breaking prefixes.
  • prefix_file::Union{String,Nothing}: Optional. A path to a file containing non-breaking prefixes to add to the provided prefixes.
  • lang::Union{String,Nothing}: Optional. The language of the non-breaking prefixes to be added to prefixes (see ?SUPPORTED_LANG for available languages). Default is "en" (English).

Examples

split_sentence("This is a paragraph. It contains several sentences. \"But why,\" you ask?")
# Output: ["This is a paragraph.", "It contains several sentences.", "\"But why,\" you ask?"]
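
A slightly fuller sketch of the same call, written as a runnable script. It shows the inner double quotes escaped as Julia string literals require, and the `lang` keyword set explicitly. The German sample text is made up for illustration, and its output is not shown because it depends on the prefix list in use.

```julia
using Sentencize

text = "This is a paragraph. It contains several sentences. \"But why,\" you ask?"

# Default call: uses lang="en".
sentences = Sentencize.split_sentence(text)

# Print one sentence per line.
foreach(println, sentences)

# Same function with the language chosen explicitly; "de" is listed in SUPPORTED_LANG.
german = Sentencize.split_sentence("Das ist ein Satz. Hier ist noch einer.", lang="de")
foreach(println, german)
```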