367 lines
13 KiB
Markdown
Raw Normal View History

2025-01-12 00:52:51 +08:00
# stringr 1.5.1
* Some minor documentation improvements.
* `str_trunc()` now correctly truncates strings when `side` is `"left"` or
`"center"` (@UchidaMizuki, #512).
# stringr 1.5.0
## Breaking changes
* stringr functions now consistently implement the tidyverse recycling rules
(#372). There are two main changes:
* Only vectors of length 1 are recycled. Previously, (e.g.)
`str_detect(letters, c("x", "y"))` worked, but it now errors.
* `str_c()` ignores `NULLs`, rather than treating them as length 0
vectors.
Additionally, many more arguments now throw errors, rather than warnings,
if supplied the wrong type of input.
* `regex()` and friends now generate class names with `stringr_` prefix (#384).
* `str_detect()`, `str_starts()`, `str_ends()` and `str_subset()` now error
when used with either an empty string (`""`) or a `boundary()`. These
operations didn't really make sense (`str_detect(x, "")` returned `TRUE`
for all non-empty strings) and made it easy to make mistakes when programming.
## New features
* Many tweaks to the documentation to make it more useful and consistent.
* New `vignette("from-base")` by @sastoudt provides a comprehensive comparison
between base R functions and their stringr equivalents. It's designed to
help you move to stringr if you're already familiar with base R string
functions (#266).
* New `str_escape()` escapes regular expression metacharacters, providing
an alternative to `fixed()` if you want to compose a pattern from user
supplied strings (#408).
* New `str_equal()` compares two character vectors using unicode rules,
optionally ignoring case (#381).
* `str_extract()` can now optionally extract a capturing group instead of
the complete match (#420).
* New `str_flatten_comma()` is a special case of `str_flatten()` designed for
comma separated flattening and can correctly apply the Oxford commas
when there are only two elements (#444).
* New `str_split_1()` is tailored for the special case of splitting up a single
string (#409).
* New `str_split_i()` extract a single piece from a string (#278, @bfgray3).
* New `str_like()` allows the use of SQL wildcards (#280, @rjpat).
* New `str_rank()` to complete the set of order/rank/sort functions (#353).
* New `str_sub_all()` to extract multiple substrings from each string.
* New `str_unique()` is a wrapper around `stri_unique()` and returns unique
string values in a character vector (#249, @seasmith).
* `str_view()` uses ANSI colouring rather than an HTML widget (#370). This
works in more places and requires fewer dependencies. It includes a number
of other small improvements:
* It no longer requires a pattern so you can use it to display strings with
special characters.
* It highlights unusual whitespace characters.
* It's vectorised over both string` and `pattern` (#407).
* It defaults to displaying all matches, making `str_view_all()` redundant
(and hence deprecated) (#455).
* New `str_width()` returns the display width of a string (#380).
* stringr is now licensed as MIT (#351).
## Minor improvements and bug fixes
* Better error message if you supply a non-string pattern (#378).
* A new data source for `sentences` has fixed many small errors.
* `str_extract()` and `str_exctract_all()` now work correctly when `pattern`
is a `boundary()`.
* `str_flatten()` gains a `last` argument that optionally override the
final separator (#377). It gains a `na.rm` argument to remove missing
values (since it's a summary function) (#439).
* `str_pad()` gains `use_width` argument to control whether to use the total
code point width or the number of code points as "width" of a string (#190).
* `str_replace()` and `str_replace_all()` can use standard tidyverse formula
shorthand for `replacement` function (#331).
* `str_starts()` and `str_ends()` now correctly respect regex operator
precedence (@carlganz).
* `str_wrap()` breaks only at whitespace by default; set
`whitespace_only = FALSE` to return to the previous behaviour (#335, @rjpat).
* `word()` now returns all the sentence when using a negative `start` parameter
that is greater or equal than the number of words. (@pdelboca, #245)
# stringr 1.4.1
Hot patch release to resolve R CMD check failures.
# stringr 1.4.0
* `str_interp()` now renders lists consistently independent on the presence of
additional placeholders (@amhrasmussen).
* New `str_starts()` and `str_ends()` functions to detect patterns at the
beginning or end of strings (@jonthegeek, #258).
* `str_subset()`, `str_detect()`, and `str_which()` get `negate` argument,
which is useful when you want the elements that do NOT match (#259,
@yutannihilation).
* New `str_to_sentence()` function to capitalize with sentence case
(@jonthegeek, #202).
# stringr 1.3.1
* `str_replace_all()` with a named vector now respects modifier functions (#207)
* `str_trunc()` is once again vectorised correctly (#203, @austin3dickey).
* `str_view()` handles `NA` values more gracefully (#217). I've also
tweaked the sizing policy so hopefully it should work better in notebooks,
while preserving the existing behaviour in knit documents (#232).
# stringr 1.3.0
## API changes
* During package build, you may see
`Error : object ignore.case is not exported by 'namespace:stringr'`.
This is because the long deprecated `str_join()`, `ignore.case()` and
`perl()` have now been removed.
## New features
* `str_glue()` and `str_glue_data()` provide convenient wrappers around
`glue` and `glue_data()` from the [glue](https://glue.tidyverse.org/) package
(#157).
* `str_flatten()` is a wrapper around `stri_flatten()` and clearly
conveys flattening a character vector into a single string (#186).
* `str_remove()` and `str_remove_all()` functions. These wrap
`str_replace()` and `str_replace_all()` to remove patterns from strings.
(@Shians, #178)
* `str_squish()` removes spaces from both the left and right side of strings,
and also converts multiple space (or space-like characters) to a single
space within strings (@stephlocke, #197).
* `str_sub()` gains `omit_na` argument for ignoring `NA`. Accordingly,
`str_replace()` now ignores `NA`s and keeps the original strings.
(@yutannihilation, #164)
## Bug fixes and minor improvements
* `str_trunc()` now preserves NAs (@ClaytonJY, #162)
* `str_trunc()` now throws an error when `width` is shorter than `ellipsis`
(@ClaytonJY, #163).
* Long deprecated `str_join()`, `ignore.case()` and `perl()` have now been
removed.
# stringr 1.2.0
## API changes
* `str_match_all()` now returns NA if an optional group doesn't match
(previously it returned ""). This is more consistent with `str_match()`
and other match failures (#134).
## New features
* In `str_replace()`, `replacement` can now be a function that is called once
for each match and whose return value is used to replace the match.
* New `str_which()` mimics `grep()` (#129).
* A new vignette (`vignette("regular-expressions")`) describes the
details of the regular expressions supported by stringr.
The main vignette (`vignette("stringr")`) has been updated to
give a high-level overview of the package.
## Minor improvements and bug fixes
* `str_order()` and `str_sort()` gain explicit `numeric` argument for sorting
mixed numbers and strings.
* `str_replace_all()` now throws an error if `replacement` is not a character
vector. If `replacement` is `NA_character_` it replaces the complete string
with replaces with `NA` (#124).
* All functions that take a locale (e.g. `str_to_lower()` and `str_sort()`)
default to "en" (English) to ensure that the default is consistent across
platforms.
# stringr 1.1.0
* Add sample datasets: `fruit`, `words` and `sentences`.
* `fixed()`, `regex()`, and `coll()` now throw an error if you use them with
anything other than a plain string (#60). I've clarified that the replacement
for `perl()` is `regex()` not `regexp()` (#61). `boundary()` has improved
defaults when splitting on non-word boundaries (#58, @lmullen).
* `str_detect()` now can detect boundaries (by checking for a `str_count()` > 0)
(#120). `str_subset()` works similarly.
* `str_extract()` and `str_extract_all()` now work with `boundary()`. This is
particularly useful if you want to extract logical constructs like words
or sentences. `str_extract_all()` respects the `simplify` argument
when used with `fixed()` matches.
* `str_subset()` now respects custom options for `fixed()` patterns
(#79, @gagolews).
* `str_replace()` and `str_replace_all()` now behave correctly when a
replacement string contains `$`s, `\\\\1`, etc. (#83, #99).
* `str_split()` gains a `simplify` argument to match `str_extract_all()`
etc.
* `str_view()` and `str_view_all()` create HTML widgets that display regular
expression matches (#96).
* `word()` returns `NA` for indexes greater than number of words (#112).
# stringr 1.0.0
* stringr is now powered by [stringi](https://github.com/gagolews/stringi)
instead of base R regular expressions. This improves unicode and support, and
makes most operations considerably faster. If you find stringr inadequate for
your string processing needs, I highly recommend looking at stringi in more
detail.
* stringr gains a vignette, currently a straight forward update of the article
that appeared in the R Journal.
* `str_c()` now returns a zero length vector if any of its inputs are
zero length vectors. This is consistent with all other functions, and
standard R recycling rules. Similarly, using `str_c("x", NA)` now
yields `NA`. If you want `"xNA"`, use `str_replace_na()` on the inputs.
* `str_replace_all()` gains a convenient syntax for applying multiple pairs of
pattern and replacement to the same vector:
```R
input <- c("abc", "def")
str_replace_all(input, c("[ad]" = "!", "[cf]" = "?"))
```
* `str_match()` now returns NA if an optional group doesn't match
(previously it returned ""). This is more consistent with `str_extract()`
and other match failures.
* New `str_subset()` keeps values that match a pattern. It's a convenient
wrapper for `x[str_detect(x)]` (#21, @jiho).
* New `str_order()` and `str_sort()` allow you to sort and order strings
in a specified locale.
* New `str_conv()` to convert strings from specified encoding to UTF-8.
* New modifier `boundary()` allows you to count, locate and split by
character, word, line and sentence boundaries.
* The documentation got a lot of love, and very similar functions (e.g.
first and all variants) are now documented together. This should hopefully
make it easier to locate the function you need.
* `ignore.case(x)` has been deprecated in favour of
`fixed|regex|coll(x, ignore.case = TRUE)`, `perl(x)` has been deprecated in
favour of `regex(x)`.
* `str_join()` is deprecated, please use `str_c()` instead.
# stringr 0.6.2
* fixed path in `str_wrap` example so works for more R installations.
* remove dependency on plyr
# stringr 0.6.1
* Zero input to `str_split_fixed` returns 0 row matrix with `n` columns
* Export `str_join`
# stringr 0.6
* new modifier `perl` that switches to Perl regular expressions
* `str_match` now uses new base function `regmatches` to extract matches -
this should hopefully be faster than my previous pure R algorithm
# stringr 0.5
* new `str_wrap` function which gives `strwrap` output in a more convenient
format
* new `word` function extract words from a string given user defined
separator (thanks to suggestion by David Cooper)
* `str_locate` now returns consistent type when matching empty string (thanks
to Stavros Macrakis)
* new `str_count` counts number of matches in a string.
* `str_pad` and `str_trim` receive performance tweaks - for large vectors this
should give at least a two order of magnitude speed up
* str_length returns NA for invalid multibyte strings
* fix small bug in internal `recyclable` function
# stringr 0.4
* all functions now vectorised with respect to string, pattern (and
where appropriate) replacement parameters
* fixed() function now tells stringr functions to use fixed matching, rather
than escaping the regular expression. Should improve performance for
large vectors.
* new ignore.case() modifier tells stringr functions to ignore case of
pattern.
* str_replace renamed to str_replace_all and new str_replace function added.
This makes str_replace consistent with all functions.
* new str_sub<- function (analogous to substring<-) for substring replacement
* str_sub now understands negative positions as a position from the end of
the string. -1 replaces Inf as indicator for string end.
* str_pad side argument can be left, right, or both (instead of center)
* str_trim gains side argument to better match str_pad
* stringr now has a namespace and imports plyr (rather than requiring it)
# stringr 0.3
* fixed() now also escapes |
* str_join() renamed to str_c()
* all functions more carefully check input and return informative error
messages if not as expected.
* add invert_match() function to convert a matrix of location of matches to
locations of non-matches
* add fixed() function to allow matching of fixed strings.
# stringr 0.2
* str_length now returns correct results when used with factors
* str_sub now correctly replaces Inf in end argument with length of string
* new function str_split_fixed returns fixed number of splits in a character
matrix
* str_split no longer uses strsplit to preserve trailing breaks