# vctrs 0.6.5

* Internal changes requested by CRAN around C level format strings (#1896).

* Fixed tests related to changes to `dim<-()` in R-devel (#1889).

# vctrs 0.6.4

* Fixed a performance issue with `vec_c()` and ALTREP vectors (in particular,
  the new ALTREP list vectors in R-devel) (#1884).

* Fixed an issue with complex vector tests related to changes in R-devel
  (#1883).

* Added a class to the `vec_locate_matches()` error that is thrown when an
  overflow would otherwise occur (#1845).

* Fixed an issue with `vec_rank()` and 0-column data frames (#1863).

# vctrs 0.6.3

* Fixed an issue where certain ALTREP row names were being materialized when
  passed to `new_data_frame()`. We've fixed this by removing a safeguard in
  `new_data_frame()` that performed a compatibility check when both `n` and
  `row.names` were provided. Because this is a low level function designed for
  performance, it is up to the caller to ensure these inputs are compatible
  (tidyverse/dplyr#6596).

* Fixed an issue where `vec_set_*()` used with data frames could accidentally
  return an object with the type of the proxy rather than the type of the
  original inputs (#1837).

* Fixed a rare `vec_locate_matches()` bug that could occur when using a max/min
  `filter` (tidyverse/dplyr#6835).

# vctrs 0.6.2

* Fixed conditional S3 registration to avoid a CRAN check NOTE that appears in
  R >=4.3.0 (#1832).

* Fixed tests to maintain compatibility with the next version of waldo (#1829).

# vctrs 0.6.1

* Fixed a test related to `c.sfc()` changes in sf 1.0-10 (#1817).

# vctrs 0.6.0

* New `vec_run_sizes()` for computing the size of each run within a vector. It
  is identical to the `times` column from `vec_unrep()`, but is faster if you
  don't need the run key (#1210).

* New `sizes` argument to `vec_chop()` which allows you to partition a vector
  using an integer vector describing the size of each expected slice. It is
  particularly useful in combination with `vec_run_sizes()` and `list_sizes()`
  (#1210, #1598).

* New `obj_is_vector()`, `obj_check_vector()`, and `vec_check_size()` validation
  helpers. We believe these are a better approach to vector validation than
  `vec_assert()` and `vec_is()`, which have been marked as questioning because
  the semantics of their `ptype` arguments are hard to define and can often be
  replaced by `vec_cast()` or a type predicate function like
  `rlang::is_logical()` (#1784).

* `vec_is_list()` and `vec_check_list()` have been renamed to `obj_is_list()`
  and `obj_check_list()`, in line with the new `obj_is_vector()` helper. The
  old functions have been silently deprecated, but an official deprecation
  process will start in the next vctrs release (#1803).

* `vec_locate_matches()` gains a new `relationship` argument that holistically
  handles multiple matches between `needles` and `haystack`. In particular,
  `relationship = "many-to-one"` replaces `multiple = "error"` and
  `multiple = "warning"`, which have been removed from the documentation and
  silently soft-deprecated. Official deprecation for those options will start in
  a future release (#1791).

* `vec_locate_matches()` has changed its default `needles_arg` and
  `haystack_arg` values from `""` to `"needles"` and `"haystack"`, respectively.
  This generally generates more informative error messages (#1792).

* `vec_chop()` has gained empty `...` between `x` and the optional `indices`
  argument. For backwards compatibility, supplying `vec_chop(x, indices)`
  without naming `indices` still silently works, but will be deprecated in a
  future release (#1813).

* `vec_slice()` has gained an `error_call` argument (#1785).

* The `numeric_version` type from base R is now better supported in equality,
  comparison, and order based operations (tidyverse/dplyr#6680).

* R >=3.5.0 is now explicitly required. This is in line with the tidyverse
  policy of supporting the [5 most recent versions of
  R](https://www.tidyverse.org/blog/2019/04/r-version-support/).

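A minimal sketch of how the 0.6.0 run-size and validation helpers fit together, assuming vctrs 0.6.0 or later is installed. The input vector `x` is made up for illustration.

```r
library(vctrs)

# Hypothetical input with three runs: "a" twice, "b" once, "c" three times
x <- c("a", "a", "b", "c", "c", "c")

# Size of each run; the same values as vec_unrep(x)$times
sizes <- vec_run_sizes(x)
sizes

# Partition `x` into one slice per run using the new `sizes` argument
slices <- vec_chop(x, sizes = sizes)
slices

# The new validation helpers: a predicate and a size check
is_vec <- obj_is_vector(x)
is_vec_sym <- obj_is_vector(quote(x))  # a symbol is not a vector
vec_check_size(x, size = 6)            # errors if the size does not match
```

`sizes` must sum to the size of `x`, so `vec_run_sizes()` and `list_sizes()` are natural sources for it.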
# vctrs 0.5.2

* New `vec_expand_grid()`, which is a lower level helper that is similar to
  `tidyr::expand_grid()` (#1325).

* New `vec_set_intersect()`, `vec_set_difference()`, `vec_set_union()`, and
  `vec_set_symmetric_difference()` which compute set operations like
  `intersect()`, `setdiff()`, and `union()`, but the vctrs variants don't strip
  attributes and work with data frames (#1755, #1765).

* `vec_identify_runs()` is now faster when used with data frames (#1684).

* The maximum load factor of the internal dictionary was reduced from 77% to
  50%, which improves performance of functions like `vec_match()`,
  `vec_set_intersect()`, and `vec_unique()` in some cases (#1760).

* Fixed a bug with the internal `vec_order_radix()` function related to matrix
  columns (#1753).

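The set helpers and `vec_expand_grid()` can be sketched as follows, assuming vctrs 0.5.2 or later. The dates and data frames are invented for the example.

```r
library(vctrs)

x <- as.Date("2020-01-01") + 0:2   # Jan 1, 2, 3
y <- as.Date("2020-01-01") + 1:3   # Jan 2, 3, 4

# The result keeps the Date class rather than dropping to bare numbers
common <- vec_set_intersect(x, y)
only_x <- vec_set_difference(x, y)
either <- vec_set_union(x, y)
not_both <- vec_set_symmetric_difference(x, y)

# Data frames are treated row-wise
df1 <- data_frame(a = c(1, 2), b = c("x", "y"))
df2 <- data_frame(a = c(2, 3), b = c("y", "z"))
shared_rows <- vec_set_intersect(df1, df2)

# All combinations of the inputs; the first column varies slowest
grid <- vec_expand_grid(x = 1:2, y = c("a", "b"))
```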
# vctrs 0.5.1

* Fix for CRAN checks.


# vctrs 0.5.0

* vctrs is now compliant with `-Wstrict-prototypes` as requested by CRAN
  (#1729).

* `vec_ptype2()` now consistently falls back to bare data frame in
  case of incompatible data frame subclasses. This is part of a
  general move towards relaxed coercion rules.

* Common type and cast errors now inherit from `"vctrs_error_ptype2"`
  and `"vctrs_error_cast"` respectively. They are still both
  subclasses of `"vctrs_error_incompatible_type"` (which used to be
  their most specific class and is now a parent class).

* New `list_all_size()` and `list_check_all_size()` to quickly determine if a
  list contains elements of a particular `size` (#1582).

* `list_unchop()` has gained empty `...` to force optional arguments to be
  named (#1715).

* `vec_rep_each(times = 0)` now works correctly with logical vectors that are
  considered unspecified and with named vectors (#1673).

* `list_of()` was relaxed to make it easier to combine. It is now
  coercible with `list()` (#1161). When incompatible `list_of()` types
  are combined, the result is now a bare `list()`.

  Following this change, the role of `list_of()` is mainly to carry
  type information for potential optimisations, rather than to
  guarantee a certain type throughout an analysis.

* `validate_list_of()` has been removed. It hasn't proven to be practically
  useful, and isn't used by any packages on CRAN (#1697).

* Directed calls to `vec_c()`, like `vec_c(.ptype = <type>)`, now mention the
  position of the problematic argument when there are cast errors (#1690).

* `list_unchop()` no longer drops names in some cases when `indices` were
  supplied (#1689).

* `"unique_quiet"` and `"universal_quiet"` are newly accepted by
  `vec_as_names(repair =)` and `vec_names2(repair =)`. These options exist to
  help users who call these functions indirectly, via another function which
  only exposes `repair` but not `quiet`. Specifying `repair = "unique_quiet"` is
  like specifying `repair = "unique", quiet = TRUE`. When the `"*_quiet"`
  options are used, any setting of `quiet` is silently overridden (@jennybc,
  #1629).

  `"unique_quiet"` and `"universal_quiet"` are also newly accepted for the name
  repair argument of several other functions that do not expose a `quiet`
  argument: `data_frame()`, `df_list()`, `vec_c()`, `list_unchop()`,
  `vec_interleave()`, `vec_rbind()`, and `vec_cbind()` (@jennybc, #1716).

* `list_unchop()` has gained `error_call` and `error_arg` arguments (#1641,
  #1692).

* `vec_c()` has gained `.error_call` and `.error_arg` arguments (#1641, #1692).

* Improved the performance of list-of common type methods (#1686, #875).

* The list-of method for `as_list_of()` now places the optional `.ptype`
  argument after the `...` (#1686).

* `vec_rbind()` now applies `base::c()` fallback recursively within
  packed df-cols (#1331, #1462, #1640).

* `vec_c()`, `vec_unchop()`, and `vec_rbind()` now proxy and restore
  recursively (#1107). This prevents `vec_restore()` from being called
  with partially filled vectors and improves performance (#1217,
  #1496).

* New `vec_any_missing()` for quickly determining if a vector has any missing
  values (#1672).

* `vec_equal_na()` has been renamed to `vec_detect_missing()` to align better
  with vctrs naming conventions. `vec_equal_na()` will stick around for a few
  minor versions, but has been formally soft-deprecated (#1672).

* `vec_c(outer = c(inner = 1))` now produces correct error messages (#522).

* If a data frame is returned as the proxy from `vec_proxy_equal()`,
  `vec_proxy_compare()`, or `vec_proxy_order()`, then the corresponding proxy
  function is now automatically applied recursively along all of the columns.
  Additionally, packed data frame columns will be unpacked, and 1 column data
  frames will be unwrapped. This ensures that the simplest possible types are
  provided to the native C algorithms, improving both correctness and
  performance (#1664).

* When used with record vectors, `vec_proxy_compare()` and `vec_proxy_order()`
  now call the correct proxy function while recursing over the fields (#1664).

* The experimental function `vec_list_cast()` has been removed from
  the package (#1382).

* Native classes like dates and datetimes now accept dimensions (#1290, #1329).

* `vec_compare()` now throws a more informative error when attempting to compare
  complex vectors (#1655).

* `vec_rep()` and friends gain `error_call`, `x_arg`, and `times_arg`
  arguments so they can be embedded in frontends (#1303).

* Record vectors now fail as expected when indexed along dimensions
  greater than 1 (#1295).

* `vec_order()` and `vec_sort()` now have `...` between the required and
  optional arguments to make them easier to extend (#1647).

* S3 vignette was extended to show how to make the polynomial class
  atomic instead of a list (#1030).

* The experimental `n` argument of `vec_restore()` has been
  removed. It was only used to inform on the size of data frames in
  case a bare list is restored. It is now expected that bare lists be
  initialised as data frames so that the size is carried through row
  attributes. This makes the generic simpler and fixes some
  performance issues (#650).

* The `anyNA()` method for `vctrs_vctr` (and thus `vctrs_list_of`) now
  supports the `recursive` argument (#1278).

* `vec_as_location()` and `num_as_location()` have gained a `missing = "remove"`
  option (#1595).

* `vec_as_location()` no longer matches `NA_character_` and `""` indices if
  those invalid names appear in `names` (#1489).

* `vec_unchop()` has been renamed to `list_unchop()` to better indicate that it
  requires list input. `vec_unchop()` will stick around for a few minor
  versions, but has been formally soft-deprecated (#1209).

* Lossy cast errors during scalar subscript validation now have the
  correct message (#1606).

* Fixed confusing error message with logical `[[` subscripts (#1608).

* New `vec_rank()` to compute various types of sample ranks (#1600).

* `num_as_location()` now throws the right error when there are out-of-bounds
  negative values and `oob = "extend"` and `negative = "ignore"` are set
  (#1614, #1630).

* `num_as_location()` now works correctly when a combination of `zero = "error"`
  and `negative = "invert"` are used (#1612).

* `data_frame()` and `df_list()` have gained `.error_call` arguments (#1610).

* `vec_locate_matches()` has gained an `error_call` argument (#1611).

* `"select"` and `"relocate"` have been added as valid subscript actions to
  support tidyselect and dplyr (#1596).

* `num_as_location()` has a new `oob = "remove"` argument to remove
  out-of-bounds locations (#1595).

* `vec_rbind()` and `vec_cbind()` now have `.error_call` arguments (#1597).

* `df_list()` has gained a new `.unpack` argument to optionally disable data
  frame unpacking (#1616).

* `vec_check_list(arg = "")` now throws the correct error (#1604).

* The `difftime` to `difftime` `vec_cast()` method now standardizes the internal
  storage type to double, catching potentially corrupt integer storage
  `difftime` vectors (#1602).

* `vec_as_location2()` and `vec_as_subscript2()` more correctly utilize their
  `call` arguments (#1605).

* `vec_count(sort = "count")` now uses a stable sorting method. This ensures
  that different keys with the same count are sorted in the order that they
  originally appeared in (#1588).

* Lossy cast error conditions now show the correct message when
  `conditionMessage()` is called on them (#1592).

* Fixed inconsistent reporting of conflicting inputs in
  `vec_ptype_common()` (#1570).

* `vec_ptype_abbr()` and `vec_ptype_full()` now suffix 1d arrays
  with `[1d]`.

* `vec_ptype_abbr()` and `vec_ptype_full()` methods are no longer
  inherited (#1549).

* `vec_cast()` now throws the correct error when attempting to cast a subclassed
  data frame to a non-data frame type (#1568).

* `vec_locate_matches()` now uses a more conservative heuristic when taking the
  joint ordering proxy. This allows it to work correctly with sf's sfc vectors
  and the classes from the bignum package (#1558).

* An sfc method for `vec_proxy_order()` was added to better support the sf
  package. These vectors are generally treated like list-columns even though
  they don't explicitly have a `"list"` class, and the `vec_proxy_order()`
  method now forwards to the list method to reflect that (#1558).

* `vec_proxy_compare()` now works correctly for raw vectors wrapped in `I()`.
  `vec_proxy_order()` now works correctly for raw and list vectors wrapped in
  `I()` (#1557).


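A short sketch of a few 0.5.0 additions (missing-value detection, ranking, list size checks, and the quiet name-repair options), assuming vctrs 0.5.0 or later. All inputs are invented for the example.

```r
library(vctrs)

x <- c(10, NA, 30)

# Missing-value helpers; vec_detect_missing() is the new name for vec_equal_na()
any_missing <- vec_any_missing(x)
which_missing <- vec_detect_missing(x)

# Sample ranks; the default tie method is "min"
ranks_min <- vec_rank(c(10, 30, 10))
ranks_dense <- vec_rank(c(10, 30, 10), ties = "dense")

# Check that every list element has a given size
same_size <- list_all_size(list(1:2, c("a", "b")), 2)
mixed_size <- list_all_size(list(1:2, "a"), 2)

# Repair duplicate names without emitting a message
repaired <- vec_as_names(c("a", "a"), repair = "unique_quiet")
```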
# vctrs 0.4.2

* HTML documentation fixes for CRAN checks.


# vctrs 0.4.1

* OOB errors with `character()` indexes use "that don't exist" instead
  of "past the end" (#1543).

* Fixed memory protection issues related to common type
  determination (#1551, tidyverse/tidyr#1348).

# vctrs 0.4.0

* New experimental `vec_locate_sorted_groups()` for returning the locations of
  groups in sorted order. This is equivalent to, but faster than, calling
  `vec_group_loc()` and then sorting by the `key` column of the result.

* New experimental `vec_locate_matches()` for locating where each observation
  in one vector matches one or more observations in another vector. It is
  similar to `vec_match()`, but returns all matches by default (rather than just
  the first), and can match on binary conditions other than equality. The
  algorithm is inspired by data.table's very fast binary merge procedure.

* The `vec_proxy_equal()`, `vec_proxy_compare()`, and `vec_proxy_order()`
  methods for `vctrs_rcrd` are now applied recursively over the fields (#1503).

* Lossy cast errors now inherit from incompatible type errors.

* `vec_is_list()` now returns `TRUE` for `AsIs` lists (#1463).

* `vec_assert()`, `vec_ptype2()`, `vec_cast()`, and `vec_as_location()`
  now use `caller_arg()` to infer a default `arg` value from the
  caller.

  This may result in unhelpful arguments being mentioned in error
  messages. In general, you should consider snapshotting vctrs error
  messages thrown in your package and supply `arg` and `call`
  arguments if the error context is not adequately reported to your
  users.

* `vec_ptype_common()`, `vec_cast_common()`, `vec_size_common()`, and
  `vec_recycle_common()` gain `call` and `arg` arguments for
  specifying an error context.

* `vec_compare()` can now compare zero column data frames (#1500).

* `new_data_frame()` now errors on negative and missing `n` values (#1477).

* `vec_order()` now correctly orders zero column data frames (#1499).

* vctrs now depends on cli to help with error message generation.

* New `vec_check_list()` and `list_check_all_vectors()` input
  checkers, and an accompanying `list_all_vectors()` predicate.

* New `vec_interleave()` for combining multiple vectors together, interleaving
  their elements in the process (#1396).

* `vec_equal_na(NULL)` now returns `logical(0)` rather than erroring (#1494).

* `vec_as_location(missing = "error")` now fails with `NA` and `NA_character_`
  in addition to `NA_integer_` (#1420, @krlmlr).

* Starting with rlang 1.0.0, errors are displayed with the contextual
  function call. Several vctrs operations gain a `call` argument that
  makes it possible to report the correct context in error messages.
  This concerns:

  - `vec_cast()` and `vec_ptype2()`
  - `vec_default_cast()` and `vec_default_ptype2()`
  - `vec_assert()`
  - `vec_as_names()`
  - `stop_` constructors like `stop_incompatible_type()`

  Note that default `vec_cast()` and `vec_ptype2()` methods
  automatically support this if they pass `...` to the corresponding
  `vec_default_` functions. If you throw a non-internal error from a
  non-default method, add a `call = caller_env()` argument in the
  method and pass it to `rlang::abort()`.

* If `NA_character_` is specified as a name for `vctrs_vctr` objects, it is
  now automatically repaired to `""` (#780).

* `""` is now an allowed name for `vctrs_vctr` objects and all its
  subclasses (`vctrs_list_of` in particular) (#780).

* `list_of()` is now much faster when many values are provided.

* `vec_as_location()` evaluates `arg` only in case of error, for performance
  (#1150, @krlmlr).

* `levels.vctrs_vctr()` now returns `NULL` instead of failing (#1186, @krlmlr).

* `vec_assert()` produces a more informative error when `size` is invalid
  (#1470).

* `vec_duplicate_detect()` is a bit faster when there are many unique values.

* `vec_proxy_order()` is described in `vignette("s3-vectors")` (#1373, @krlmlr).

* `vec_chop()` now materializes ALTREP vectors before chopping, which is more
  efficient than creating many small ALTREP pieces (#1450).

* New `list_drop_empty()` for removing empty elements from a list (#1395).

* `list_sizes()` now propagates the names of the list onto the result.

* Name repair messages are now signaled by `rlang::names_inform_repair()`. This
  means that the messages are now sent to stdout by default rather than to
  stderr, resulting in prettier messages. Additionally, name repair messages can
  now be silenced through the global option `rlib_name_repair_verbosity`, which
  is useful for testing purposes. See `?names_inform_repair` for more
  information (#1429).

* `vctrs_vctr` methods for `na.omit()`, `na.exclude()`, and `na.fail()` have
  been added (#1413).

* `vec_init()` is now slightly faster (#1423).

* `vec_set_names()` no longer corrupts `vctrs_rcrd` types (#1419).

* `vec_detect_complete()` now computes completeness for `vctrs_rcrd` types in
  the same way as data frames, which means that if any field is missing, the
  entire record is considered incomplete (#1386).

* The `na_value` argument of `vec_order()` and `vec_sort()` now correctly
  respects missing values in lists (#1401).

* `vec_rep()` and `vec_rep_each()` are much faster for `times = 0` and
  `times = 1` (@mgirlich, #1392).

* `vec_equal_na()` and `vec_fill_missing()` now work with integer64 vectors
  (#1304).

* The `xtfrm()` method for `vctrs_vctr` objects no longer accidentally breaks
  ties (#1354).

* `min()`, `max()` and `range()` no longer throw an error if `na.rm = TRUE` is
  set and all values are `NA` (@gorcha, #1357). In this case, and where an empty
  input is given, it will return `Inf`/`-Inf`, or `NA` if `Inf` can't be cast
  to the input type.

* `vec_group_loc()`, used for grouping in dplyr, now correctly handles
  vectors with billions of elements (up to `.Machine$integer.max`) (#1133).


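To make the difference between `vec_match()` and `vec_locate_matches()` concrete, here is a small sketch assuming a recent vctrs release. It also exercises `vec_interleave()` and `list_drop_empty()`. The `needles` and `haystack` values are made up.

```r
library(vctrs)

needles <- c(1, 2)
haystack <- c(2, 2, 3)

# vec_match() reports only the first match for each needle
first_match <- vec_match(needles, haystack)

# vec_locate_matches() returns a data frame with one row per match;
# a needle without a match gets an NA haystack location by default
all_matches <- vec_locate_matches(needles, haystack)
all_matches

# Interleave the elements of several vectors
interleaved <- vec_interleave(1:2, 3:4)

# Drop size-zero elements from a list
kept <- list_drop_empty(list(1, integer(), "a"))
```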
# vctrs 0.3.8

* Compatibility with next version of rlang.


# vctrs 0.3.7

* `vec_ptype_abbr()` gains arguments to control whether to indicate
  named vectors with a prefix (`prefix_named`) and indicate shaped
  vectors with a suffix (`suffix_shape`) (#781, @krlmlr).

* `vec_ptype()` is now an optional _performance_ generic. It is not necessary
  to implement, but if your class has a static prototype, you might consider
  implementing a custom `vec_ptype()` method that returns a constant to
  improve performance in some cases (such as common type imputation).

* New `vec_detect_complete()`, inspired by `stats::complete.cases()`. For most
  vectors, this is identical to `!vec_equal_na()`. For data frames and
  matrices, this detects rows that only contain non-missing values.

* `vec_order()` can now order complex vectors (#1330).

* Removed dependency on digest in favor of `rlang::hash()`.

* Fixed an issue where `vctrs_rcrd` objects were not being proxied correctly
  when used as a data frame column (#1318).

* `register_s3()` is now licensed with the "unlicense" which makes it very
  clear that it's fine to copy and paste into your own package
  (@maxheld83, #1254).

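The row-wise behaviour of `vec_detect_complete()` is easiest to see on a small data frame. This is a sketch with invented data. It uses `vec_detect_missing()`, the later name for `vec_equal_na()`, so it assumes vctrs 0.5.0 or later.

```r
library(vctrs)

df <- data_frame(x = c(1, NA, NA), y = c("a", "b", NA))

# A row is complete only if every column is non-missing
complete <- vec_detect_complete(df)

# A row is missing only if every column is missing, so for data frames
# this is not simply the negation of the result above
missing <- vec_detect_missing(df)

# For a plain vector the two are exact opposites
v <- c(1, NA, 3)
complete_v <- vec_detect_complete(v)
missing_v <- vec_detect_missing(v)
```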
# vctrs 0.3.6

* Fixed an issue with tibble 3.0.0 where removing column names with
  `names(x) <- NULL` is now deprecated (#1298).

* Fixed a GCC 11 issue revealed by CRAN checks.


# vctrs 0.3.5

* New experimental `vec_fill_missing()` for filling in missing values with
  the previous or following value. It is similar to `tidyr::fill()`, but
  also works with data frames and has an additional `max_fill` argument to
  limit the number of sequential missing values to fill.

* New `vec_unrep()` to compress a vector with repeated values. It is very
  similar to run length encoding, and works nicely alongside `vec_rep_each()`
  as a way to invert the compression.

* `vec_cbind()` with only empty data frames now preserves the common size of
  the inputs in the result (#1281).

* `vec_c()` now correctly returns a named result with named empty inputs
  (#1263).

* vctrs has been relicensed as MIT (#1259).

* Functions that make comparisons within a single vector, such as
  `vec_unique()`, or between two vectors, such as `vec_match()`, now
  convert all character input to UTF-8 before making comparisons (#1246).

* New `vec_identify_runs()` which returns a vector of identifiers for the
  elements of `x` that indicate which run of repeated values they fall in
  (#1081).

* Fixed an encoding translation bug with lists containing data frames which
  have columns where `vec_size()` is different from the low level
  `Rf_length()` (#1233).


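The compression round trip and the `max_fill` limit described above can be sketched as follows, assuming a vctrs version that ships `vec_unrep()` and `vec_fill_missing()`. The vectors are invented for the example.

```r
library(vctrs)

x <- c("a", "a", "b", "a")

# Compress into a data frame of run keys and run lengths
runs <- vec_unrep(x)
runs

# Invert the compression
restored <- vec_rep_each(runs$key, runs$times)

y <- c(NA, 1, NA, NA, 2)

# Carry the last non-missing value forward; a leading NA has nothing to copy
filled <- vec_fill_missing(y, direction = "down")

# Fill at most one sequential missing value per gap
filled_once <- vec_fill_missing(y, direction = "down", max_fill = 1)
```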
# vctrs 0.3.4

* Fixed a GCC sanitiser error revealed by CRAN checks.


# vctrs 0.3.3

* The `table` class is now implemented as a wrapper type that
  delegates its coercion methods. It used to be restricted to integer
  tables (#1190).

* Named one-dimensional arrays now behave consistently with simple
  vectors in `vec_names()` and `vec_rbind()`.

* `new_rcrd()` now uses `df_list()` to validate the fields. This makes
  it more flexible as the fields can now be of any type supported by
  vctrs, including data frames.

* Thanks to the previous change, the `[[` method of records now
  preserves list fields (#1205).

* `vec_data()` now preserves data frames. This is consistent with the
  notion that data frames are a primitive vector type in vctrs. This
  shouldn't affect code that uses `[[` and `length()` to manipulate
  the data. On the other hand, the vctrs primitives like `vec_slice()`
  will now operate rowwise when `vec_data()` returns a data frame.

* `outer` is now passed unrecycled to name specifications. Instead,
  the return value is recycled (#1099).

* Name specifications can now return `NULL`. The names vector will
  only be allocated if the spec function returns non-`NULL` during the
  concatenation. This makes it possible to ignore outer names without
  having to create an empty names vector when there are no inner
  names:

  ```
  zap_outer_spec <- function(outer, inner) if (is_character(inner)) inner

  # `NULL` names rather than a vector of ""
  names(vec_c(a = 1:2, .name_spec = zap_outer_spec))
  #> NULL

  # Names are allocated when inner names exist
  names(vec_c(a = 1:2, c(b = 3L), .name_spec = zap_outer_spec))
  #> [1] "" "" "b"
  ```

* Fixed several performance issues in `vec_c()` and `vec_unchop()`
  with named vectors.

* The restriction that S3 lists must have a list-based proxy to be considered
  lists by `vec_is_list()` has been removed (#1208).

* New performant `data_frame()` constructor for creating data frames in a way
  that follows tidyverse semantics. Among other things, inputs are recycled
  using tidyverse recycling rules, strings are never converted to factors,
  list-columns are easier to create, and unnamed data frame input is
  automatically spliced.

* New `df_list()` for safely and consistently constructing the data structure
  underlying a data frame, a named list of equal-length vectors. It is useful
  in combination with `new_data_frame()` for creating user-friendly
  constructors for data frame subclasses that use the tidyverse rules for
  recycling and determining types.

* Fixed performance issue with `vec_order()` on classed vectors which
  affected `dplyr::group_by()` (tidyverse/dplyr#5423).

* `vec_set_names()` no longer alters the input in-place (#1194).

* New `vec_proxy_order()` that provides an ordering proxy for use in
  `vec_order()` and `vec_sort()`. The default method falls through to
  `vec_proxy_compare()`. Lists are special cased, and return an integer
  vector proxy that orders by first appearance.

* List columns in data frames are no longer comparable through `vec_compare()`.

* The experimental `relax` argument has been removed from
  `vec_proxy_compare()`.


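As an illustration of `data_frame()` and of pairing `df_list()` with `new_data_frame()`, here is a minimal sketch. The subclass name `my_df` and the constructor `new_my_df()` are hypothetical and exist only for this example.

```r
library(vctrs)

# Size-1 inputs are recycled and strings stay character
df <- data_frame(x = 1:3, y = "a")

# Hypothetical user-facing constructor for a data frame subclass:
# df_list() applies tidyverse recycling, new_data_frame() attaches the class
new_my_df <- function(...) {
  cols <- df_list(...)
  new_data_frame(cols, class = "my_df")
}

out <- new_my_df(id = 1:2, label = "x")
class(out)
```

Keeping validation in `df_list()` leaves `new_data_frame()` as a cheap low-level step, which matches its role as a performance-oriented constructor.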
# vctrs 0.3.2

* Fixed a performance issue in `bind_rows()` with S3 columns (#1122,
  #1124, #1151, tidyverse/dplyr#5327).

* `vec_slice()` now checks sizes of data frame columns in case the
  data structure is corrupt (#552).

* The native routines in vctrs now dispatch and evaluate in the vctrs
  namespace. This improves the continuity of evaluation in backtraces.

* `new_data_frame()` is now twice as fast when `class` is supplied.

* New `vec_names2()`, `vec_names()` and `vec_set_names()` (#1173).


# vctrs 0.3.1

* `vec_slice()` no longer restores attributes of foreign objects for
  which a `[` method exists. This fixes an issue with `ts` objects
  which were previously incorrectly restored.

* The `as.list()` method for `vctrs_rcrd` objects has been removed in favor
  of directly using the method for `vctrs_vctr`, which calls `vec_chop()`.

* `vec_c()` and `vec_rbind()` now fall back to `base::c()` if the
  inputs have a common class hierarchy for which a `c()` method is
  implemented but no self-to-self `vec_ptype2()` method is
  implemented.

* `vec_rbind()` now internally calls `vec_proxy()` and `vec_restore()` on
  the data frame common type that is used to create the output (#1109).

* `vec_as_location2("0")` now works correctly (#1131).

* `?reference-faq-compatibility` is a new reference guide on vctrs
  primitives. It includes an overview of the fallbacks to base R
  generics implemented in vctrs for compatibility with existing
  classes.

* The documentation of vctrs functions now includes a Dependencies
  section to reference which other vctrs operations are called from
  that function. By following the dependencies links recursively, you
  will find the vctrs primitives on which an operation relies.


## CRAN results

* Fixed type declaration mismatches revealed by LTO build.
* Fixed r-devel issue with new `c.factor()` method.


# vctrs 0.3.0

This version features an overhaul of the coercion system to make it
more consistent and easier to implement. See the _Breaking changes_
and _Type system_ sections for details.

There are three new documentation topics if you'd like to learn how to
implement coercion methods to make your class compatible with
tidyverse packages like dplyr:

* https://vctrs.r-lib.org/reference/theory-faq-coercion.html for an
  overview of the coercion mechanism in vctrs.

* https://vctrs.r-lib.org/reference/howto-faq-coercion.html for a
  practical guide about implementing methods for vectors.

* https://vctrs.r-lib.org/reference/howto-faq-coercion-data-frame.html
  for a practical guide about implementing methods for data frames.


## Reverse dependencies troubleshooting

The following errors are caused by breaking changes.

* `"Can't convert <character> to <list>."`

  `vec_cast()` no longer converts to list. Use `vec_chop()` or
  `as.list()` instead.

* `"Can't convert <integer> to <character>."`

  `vec_cast()` no longer converts to character. Use `as.character()` to
  deparse objects.

* `"names for target but not for current"`

  Names of list-columns are now preserved by `vec_rbind()`. Adjust
  tests accordingly.

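The replacements suggested for the first two errors can be sketched as follows. This is a minimal illustration that assumes a recent vctrs release is installed; the input values are made up for the example.

```r
library(vctrs)

x <- c("a", "b")

# vec_cast() refuses the character-to-list conversion.
cast_error <- tryCatch(vec_cast(x, list()), error = function(e) e)

# Both suggested replacements return a list with one element per input element.
chopped <- vec_chop(x)
as_list <- as.list(x)

# Integer-to-character conversion goes through as.character() instead.
chr <- as.character(1:3)
```

Which replacement fits depends on the caller: `vec_chop()` keeps vctrs semantics for classed vectors, while `as.list()` dispatches on base R methods.
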
## Breaking changes

* Double-dispatch methods for `vec_ptype2()` and `vec_cast()` are no
  longer inherited (#710). Class implementers must implement one set
  of methods for each compatible class.

  For example, a tibble subclass no longer inherits from the
  `vec_ptype2()` methods between `tbl_df` and `data.frame`. This means
  that you explicitly need to implement `vec_ptype2()` methods with
  `tbl_df` and `data.frame`.

  This change requires a bit more work from class maintainers but is
  safer because the coercion hierarchies are generally different from
  class hierarchies. See the S3 dispatch section of `?vec_ptype2` for
  more information.

* `vec_cast()` is now restricted to the same conversions as
  `vec_ptype2()` methods (#606, #741). This change is motivated by
  safety and performance:

  - It is generally sloppy to generically convert arbitrary inputs to
    one type. Restricted coercions are more predictable and allow your
    code to fail earlier when there is a type issue.

  - When unrestricted conversions are useful, this is generally
    towards a known type. For example, `glue::glue()` needs to convert
    arbitrary inputs to the known character type. In this case, using
    double dispatch instead of a single dispatch generic like
    `as.character()` is wasteful.

  - To implement the useful semantics of coercible casts (already used
    in `vec_assign()`), two double dispatches were needed. Now it can be
    done with one double dispatch by calling `vec_cast()` directly.

* `stop_incompatible_cast()` now throws an error of class
  `vctrs_error_incompatible_type` rather than `vctrs_error_incompatible_cast`.
  This means that `vec_cast()` also throws errors of this class, which better
  aligns it with `vec_ptype2()` now that they are restricted to the same
  conversions.

* The `y` argument of `stop_incompatible_cast()` has been renamed to `to` to
  better match `to_arg`.

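As a small illustration of the shared error class, the sketch below triggers one incompatible cast and one incompatible common type and checks the class of both conditions. It assumes a recent vctrs release; the inputs are arbitrary.

```r
library(vctrs)

# An incompatible cast and an incompatible common type...
cast_error <- tryCatch(vec_cast("a", integer()), error = function(e) e)
ptype2_error <- tryCatch(vec_ptype2("a", 1L), error = function(e) e)

# ...can both be caught with the same condition class.
cast_is_incompatible <- inherits(cast_error, "vctrs_error_incompatible_type")
ptype2_is_incompatible <- inherits(ptype2_error, "vctrs_error_incompatible_type")
```

A single `tryCatch(..., vctrs_error_incompatible_type = )` handler is therefore enough for code that wants to treat both failures the same way.
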
## Type system

* Double-dispatch methods for `vec_ptype2()` and `vec_cast()` are now
  easier to implement. They no longer need any boilerplate.
  Implementing a method for classes `foo` and `bar` is now as simple as:

  ```
  #' @export
  vec_ptype2.foo.bar <- function(x, y, ...) new_foo()
  ```

  vctrs also takes care of implementing the default and unspecified
  methods. If you have implemented these methods, they are no longer
  called and can now be removed.

  One consequence of the new dispatch mechanism is that `NextMethod()`
  is now completely unsupported. This is for the best as it never
  worked correctly in a double-dispatch setting. Parent methods must
  now be called manually.

* `vec_ptype2()` methods now get zero-size prototypes as inputs. This
  guarantees that methods do not peek at the data to determine the
  richer type.

* `vec_is_list()` no longer allows S3 lists that implement a `vec_proxy()`
  method to automatically be considered lists. An S3 list must explicitly
  inherit from `"list"` in the base class to be considered a list.

* `vec_restore()` no longer restores row names if the target is not a
  data frame. This fixes an issue where `POSIXlt` objects would carry
  a `row.names` attribute after a proxy/restore roundtrip.

* `vec_cast()` to and from data frames preserves the row names of
  inputs.

* The internal function `vec_names()` now returns row names if the
  input is a data frame. Similarly, `vec_set_names()` sets row names
  on data frames. This is part of a general effort at making row names
  the vector names of data frames in vctrs.

  If necessary, the row names are repaired verbosely but without error
  to make them unique. This should be a mostly harmless change for
  users, but it could break unit tests in packages if they make
  assumptions about the row names.

## Compatibility and fallbacks

* With the double dispatch changes, the coercion methods are no longer
  inherited from parent classes. This is because the coercion
  hierarchy is in principle different from the S3 hierarchy. A
  consequence of this change is that subclasses that don't implement
  coercion methods are now in principle incompatible.

  This is particularly problematic with subclasses of data frames for
  which throwing incompatible errors would be too inconvenient for
  users. To work around this, we have implemented a fallback to the
  relevant base data frame class (either `data.frame` or `tbl_df`) in
  coercion methods (#981). This fallback is silent unless you set the
  `vctrs:::warn_on_fallback` option to `TRUE`.

  In the future we may extend this fallback principle to other base
  types when they are explicitly included in the class vector (such as
  `"list"`).

* Improved support for foreign classes in the combining operations
  `vec_c()`, `vec_rbind()`, and `vec_unchop()`. A foreign class is a
  class that doesn't implement `vec_ptype2()`. When all the objects to
  combine have the same foreign class, one of these fallbacks is invoked:

  - If the class implements a `base::c()` method, the method is used
    for the combination. (FIXME: `vec_rbind()` currently doesn't use
    this fallback.)

  - Otherwise if the objects have identical attributes and the same
    base type, we consider them to be compatible. The vectors are
    concatenated and the attributes are restored (#776).

  These fallbacks do not make your class completely compatible with
  vctrs-powered packages, but they should help in many simple cases.

* `vec_c()` and `vec_unchop()` now fall back to `base::c()` for S4 objects if
  the object doesn't implement `vec_ptype2()` but sets an S4 `c()`
  method (#919).

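The identical-attributes fallback can be illustrated with a throwaway class. This is only a sketch: `my_foreign` is a hypothetical class name invented for the example, it has no vctrs methods and no `c()` method, and the behaviour shown assumes a recent vctrs release.

```r
library(vctrs)

# Two objects of a hypothetical foreign class with identical attributes
# and the same base type (integer).
x <- structure(1:2, class = "my_foreign")
y <- structure(3:4, class = "my_foreign")

# The underlying vectors are concatenated and the attributes restored.
out <- vec_c(x, y)

out_class <- class(out)
out_values <- as.vector(unclass(out))
```

If the two inputs carried different attributes, no fallback would apply and the combination would fail with an incompatible type error.
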
## Vector operations

* `vec_rbind()` and `vec_c()` with data frame inputs now consistently
  preserve the names of list-columns, df-columns, and matrix-columns
  (#689). This can cause some false positives in unit tests, if they
  are sensitive to internal names (#1007).

* `vec_rbind()` now repairs row names silently to avoid confusing
  messages when the row names are not informative and were not created
  on purpose.

* `vec_rbind()` gains an option to treat input names as row names. This
  is disabled by default (#966).

* New `vec_rep()` and `vec_rep_each()` for repeating an entire vector
  and elements of a vector, respectively. These two functions provide
  a clearer interface for the functionality of `vec_repeat()`, which
  is now deprecated.

* `vec_cbind()` now calls `vec_restore()` on inputs emptied of their
  columns before computing the common type. This has
  consequences for data frame classes with special columns that
  devolve into simpler classes when the columns are subsetted
  out. These classes are now always simplified by `vec_cbind()`.

  For instance, column-binding a grouped data frame with a data frame
  now produces a tibble (the simplified class of a grouped data
  frame).

* `vec_match()` and `vec_in()` gain parameters for argument tags (#944).

* The internal version of `vec_assign()` now has support for assigning
  names and inner names. For data frames, the names are assigned
  recursively.

* `vec_assign()` gains `x_arg` and `value_arg` parameters (#918).

* `vec_group_loc()`, which powers `dplyr::group_by()`, now has more
  efficient vector access (#911).

* `vec_ptype()` gained an `x_arg` argument.

* New `list_sizes()` for computing the size of every element in a list.
  `list_sizes()` is to `vec_size()` as `lengths()` is to `length()`, except
  that it only supports lists. Atomic vectors and data frames result in an
  error.

* `new_data_frame()` infers size from row names when `n = NULL` (#894).

* `vec_c()` now accepts `rlang::zap()` as `.name_spec` input. The
  returned vector is then always unnamed, and the names do not cause
  errors when they can't be combined. They are still used to create
  more informative messages when the inputs have incompatible types (#232).

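A few of the additions above can be tried side by side. The sketch below is illustrative only, assumes recent vctrs and rlang releases, and uses made-up inputs.

```r
library(vctrs)

# vec_rep() repeats the whole vector, vec_rep_each() repeats each element.
rep_all <- vec_rep(1:2, times = 3)        # 1 2 1 2 1 2
rep_each <- vec_rep_each(1:2, times = 3)  # 1 1 1 2 2 2

# list_sizes() follows vec_size(), so a data frame counts its rows,
# whereas lengths() counts its columns.
x <- list(1:3, letters[1:2], data.frame(a = 1:4, b = 5:8))
sizes <- list_sizes(x)  # 3 2 4
lens <- lengths(x)      # 3 2 2

# rlang::zap() as `.name_spec` drops names instead of trying to merge them.
unnamed <- vec_c(foo = c(a = 1), .name_spec = rlang::zap())
```

Without `.name_spec`, the last call would fail because the outer name `foo` cannot be merged with the inner name `a`.
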
## Classes

* vctrs now supports the `data.table` class. The common type of a data
  frame and a data table is a data table.

* `new_vctr()` now always appends a base `"list"` class to list `.data` to
  be compatible with changes to `vec_is_list()`. This affects `new_list_of()`,
  which now returns an object with a base class of `"list"`.

* dplyr methods are now implemented for `vec_restore()`,
  `vec_ptype2()`, and `vec_cast()`. The user-visible consequence (and
  breaking change) is that row-binding a grouped data frame and a data
  frame or tibble now returns a grouped data frame. It would
  previously return a tibble.

* The `is.na<-()` method for `vctrs_vctr` now supports numeric and
  character subscripts to indicate where to insert missing values (#947).

* Improved support for vector-like S4 objects (#550, #551).

* The base classes `AsIs` and `table` have vctrs methods (#904, #906).

* `POSIXlt` and `POSIXct` vectors are handled more consistently (#901).

* Ordered factors that do not have identical levels are now incompatible.
  Ordered factors are also now incompatible with all factors.

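The factor rules in the last item can be checked directly. This is a minimal sketch assuming a recent vctrs release; the factor values are arbitrary.

```r
library(vctrs)

# Two unordered factors still combine; the levels are unioned.
f <- vec_c(factor("a"), factor("b"))
f_levels <- levels(f)

# An ordered factor and an unordered factor have no common type.
err <- tryCatch(
  vec_ptype2(factor("a"), ordered("a")),
  error = function(e) e
)
is_incompatible <- inherits(err, "vctrs_error_incompatible_type")
```
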
## Indexing and names

* `vec_as_subscript()` now fails when the subscript is a matrix or an
  array, consistently with `vec_as_location()`.

* Improved error messages in `vec_as_location()` when subscript is a
  matrix or array (#936).

* `vec_as_location2()` properly picks up `subscript_arg`
  (tidyverse/tibble#735).

* `vec_as_names()` now has more informative error messages when names
  are not unique (#882).

* `vec_as_names()` gains a `repair_arg` argument that when set will cause
  `repair = "check_unique"` to generate an informative hint (#692).

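To see the repair strategies that these `vec_as_names()` changes relate to, the following sketch compares three of them on a duplicated name. It assumes a recent vctrs release and does not reproduce the exact wording of the error message.

```r
library(vctrs)

dup <- c("a", "a")

# The default "minimal" repair leaves duplicates alone.
minimal <- vec_as_names(dup)

# "check_unique" turns duplicates into an error.
err <- tryCatch(
  vec_as_names(dup, repair = "check_unique"),
  error = function(e) e
)

# "unique" repairs duplicates by appending their positions.
repaired <- vec_as_names(dup, repair = "unique", quiet = TRUE)  # "a...1" "a...2"
```
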
## Conditions

* `stop_incompatible_type()` now has an `action` argument for customizing
  whether the coercion error came from `vec_ptype2()` or `vec_cast()`.
  `stop_incompatible_cast()` is now a thin wrapper around
  `stop_incompatible_type(action = "convert")`.

* `stop_` functions now take `details` after the dots. This argument
  can no longer be passed by position.

* Supplying both `details` and `message` to the `stop_` functions is
  now an internal error.

* `x_arg`, `y_arg`, and `to_arg` are now compulsory arguments in
  `stop_` functions like `stop_incompatible_type()`.

* Lossy cast errors are now considered internal. Please don't test for
  the class or explicitly handle them.

* New argument `loss_type` for the experimental function
  `maybe_lossy_cast()`. It can take the values "precision" or
  "generality" to indicate in the error message which kind of loss is
  the error about (double to integer loses precision, character to
  factor loses generality).

* Coercion and recycling errors are now more consistent.

## CRAN results

* Fixed clang-UBSAN error "nan is outside the range of representable
  values of type 'int'" (#902).

* Fixed compilation of stability vignette following the date
  conversion changes on R-devel.

# vctrs 0.2.4

* Factors and dates methods are now implemented in C for efficiency.

* `new_data_frame()` now correctly updates attributes and supports merging
  of the `"names"` and `"row.names"` arguments (#883).

* `vec_match()` gains an `na_equal` argument (#718).

* `vec_chop()`'s `indices` argument has been restricted to positive integer
  vectors. Character and logical subscripts haven't proven useful, and this
  aligns `vec_chop()` with `vec_unchop()`, for which only positive integer
  vectors make sense.

* New `vec_unchop()` for combining a list of vectors into a single vector. It
  is similar to `vec_c()`, but gives greater control over how the elements
  are placed in the output through the use of a secondary `indices` argument.

* Breaking change: When `.id` is supplied, `vec_rbind()` now creates
  the identifier column at the start of the data frame rather than at
  the end.

* `numeric_version` and `package_version` lists are now treated as
  vectors (#723).

* `vec_slice()` now properly handles symbols and S3 subscripts.

* `vec_as_location()` and `vec_as_subscript()` are now fully
  implemented in C for efficiency.

* `num_as_location()` gains a new argument, `zero`, for controlling whether
  to `"remove"`, `"ignore"`, or `"error"` on zero values (#852).

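Two of the subscript-related items above are easy to try out. The sketch below assumes a recent vctrs release and uses arbitrary example inputs.

```r
library(vctrs)

x <- c("a", "b", "c", "d")

# Each element of `indices` is a vector of positive integer locations.
pieces <- vec_chop(x, indices = list(1:2, 3:4))

# `zero` decides what happens to 0 in a numeric subscript.
removed <- num_as_location(c(1, 0, 2), n = 4L, zero = "remove")  # 1 2
ignored <- num_as_location(c(1, 0, 2), n = 4L, zero = "ignore")  # 1 0 2
zero_error <- tryCatch(
  num_as_location(c(1, 0, 2), n = 4L, zero = "error"),
  error = function(e) e
)
```
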
# vctrs 0.2.3

* The main feature of this release is considerable performance
  improvements with factors and dates.

* `vec_c()` now falls back to `base::c()` if the vector doesn't
  implement `vec_ptype2()` but implements `c()`. This should improve
  the compatibility of vctrs-based functions with foreign classes
  (#801).

* `new_data_frame()` is now faster.

* New `vec_is_list()` for detecting if a vector is a list in the vctrs sense.
  For instance, objects of class `lm` are not lists. In general, classes need
  to explicitly inherit from `"list"` to be considered as lists by vctrs.

* Unspecified vectors of `NA` can now be assigned into a list (#819).

  ```
  x <- list(1, 2)
  vec_slice(x, 1) <- NA
  x
  #> [[1]]
  #> NULL
  #>
  #> [[2]]
  #> [1] 2
  ```

* `vec_ptype()` now errors on scalar inputs (#807).

* `vec_ptype_finalise()` is now recursive over all data frame types, ensuring
  that unspecified columns are correctly finalised to logical (#800).

* `vec_ptype()` now correctly handles unspecified columns in data frames, and
  will always return an unspecified column type (#800).

* `vec_slice()` and `vec_chop()` now work correctly with `bit64::integer64()`
  objects when an `NA` subscript is supplied. By extension, this means that
  `vec_init()` now works with these objects as well (#813).

* `vec_rbind()` now binds row names. When named inputs are supplied
  and `names_to` is `NULL`, the names define row names. If `names_to`
  is supplied, they are assigned in the column name as before.

* `vec_cbind()` now uses the row names of the first named input.

* The `c()` method for `vctrs_vctr` now throws an error when
  `recursive` or `use.names` is supplied (#791).

# vctrs 0.2.2

* New `vec_as_subscript()` function to cast inputs to the base type
  of a subscript (logical, numeric, or character). `vec_as_index()`
  has been renamed to `vec_as_location()`. Use `num_as_location()` if
  you need more options to control how numeric subscripts are
  converted to a vector of locations.

* New `vec_as_subscript2()`, `vec_as_location2()`, and
  `num_as_location2()` variants for validating scalar subscripts and
  locations (e.g. for indexing with `[[`).

* `vec_as_location()` now preserves names of its inputs if possible.

* `vec_ptype2()` methods for base classes now prevent
  inheritance. This makes sense because the subtyping graph created by
  `vec_ptype2()` methods is generally not the same as the inheritance
  relationships defined by S3 classes. For instance, subclasses are
  often a richer type than their superclasses, and should often be
  declared as supertypes (e.g. `vec_ptype2()` should return the
  subclass).

  We introduced this breaking change in a patch release because
  `new_vctr()` now adds the base type to the class vector by default,
  which caused `vec_ptype2()` to dispatch erroneously to the methods
  for base types. We'll finish switching to this approach in vctrs
  0.3.0 for the rest of the base S3 classes (dates, data frames, ...).

* `vec_equal_na()` now works with complex vectors.

* `vctrs_vctr` class gains an `as.POSIXlt()` method (#717).

* `vec_is()` now ignores names and row names (#707).

* `vec_slice()` now supports Altvec vectors (@jimhester, #696).

* `vec_proxy_equal()` is now applied recursively across the columns of
  data frames (#641).

* `vec_split()` no longer returns the `val` column as a `list_of`. It is now
  returned as a bare list (#660).

* Complex numbers are now coercible with integer and double (#564).

* zeallot has been moved from Imports to Suggests, meaning that `%<-%` is no
  longer re-exported from vctrs.

* `vec_equal()` no longer propagates missing values when comparing list
  elements. This means that `vec_equal(list(NULL), list(NULL))` will continue to
  return `NA` because `NULL` is the missing element for a list, but now
  `vec_equal(list(NA), list(NA))` returns `TRUE` because the `NA` values are
  compared directly without checking for missingness.

* Lists of expressions are now supported in `vec_equal()` and functions that
  compare elements, such as `vec_unique()` and `vec_match()`. This ensures that
  they work with the result of modeling functions like `glm()` and `mgcv::gam()`
  which store "family" objects containing expressions (#643).

* `new_vctr()` gains an experimental `inherit_base_type` argument
  which determines whether or not the class of the underlying type
  will be included in the class.

* `list_of()` now inherits explicitly from "list" (#593).

* `vec_ptype()` has relaxed default behaviour for base types; now if two
  vectors both inherit from (e.g.) "character", the common type is also
  "character" (#497).

* `vec_equal()` now correctly treats `NULL` as the missing value element for
  lists (#653).

* `vec_cast()` now casts data frames to lists rowwise, i.e. to a list of
  data frames of size 1. This preserves the invariant of
  `vec_size(vec_cast(x, to)) == vec_size(x)` (#639).

* Positive and negative 0 are now considered equivalent by all functions that
  check for equality or uniqueness (#637).

* New experimental functions `vec_group_rle()` for returning run
  length encoded groups; `vec_group_id()` for constructing group
  identifiers from a vector; `vec_group_loc()` for computing the
  locations of unique groups in a vector (#514).

* New `vec_chop()` for repeatedly slicing a vector. It efficiently captures
  the pattern of `map(indices, vec_slice, x = x)`.

* Support for multiple character encodings has been added to functions that
  compare elements within a single vector, such as `vec_unique()`, and across
  multiple vectors, such as `vec_match()`. When multiple encodings are
  encountered, a translation to UTF-8 is performed before any comparisons are
  made (#600, #553).

* Equality and ordering methods are now implemented for raw and
  complex vectors (@romainfrancois).

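The zero-equivalence rule and the grouping helpers from this release can be sketched as follows, assuming a recent vctrs release. The input vector is invented for the example.

```r
library(vctrs)

# Positive and negative zero compare equal and count as one unique value.
zero_equal <- vec_equal(0, -0)
zero_unique <- vec_unique(c(0, -0))

x <- c("a", "b", "a")

# One integer identifier per element, numbered by first appearance.
ids <- vec_group_id(x)

# One row per unique value, with the locations where it occurs.
groups <- vec_group_loc(x)
keys <- groups$key
first_locations <- groups$loc[[1]]
```

`vec_group_loc()` returns a data frame, so the keys and their locations stay aligned row by row.
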
# vctrs 0.2.1

Maintenance release for CRAN checks.


# vctrs 0.2.0

With the 0.2.0 release, many vctrs functions have been rewritten with
native C code to improve performance. Functions like `vec_c()` and
`vec_rbind()` should now be fast enough to be used in packages. This
is an ongoing effort; for instance, the handling of factors and dates
has not been rewritten yet. These classes still slow down vctrs
primitives.

The API in 0.2.0 has been updated; please see the list of breaking
changes below. vctrs has now graduated from experimental to a maturing
package.
Please note that API changes are still planned for future releases.
For instance, `vec_ptype2()` and `vec_cast()` might need to return a
sentinel instead of failing with an error when there is no common type
or possible cast.

## Breaking changes

* Lossy casts now throw errors of type `vctrs_error_cast_lossy`.
  Previously these were warnings. You can suppress these errors
  selectively with `allow_lossy_cast()` to get the partial cast
  results. To implement your own lossy cast operation, call the new
  exported function `maybe_lossy_cast()`.

* `vec_c()` now fails when an input is supplied with a name but has
  internal names or is length > 1:

  ```
  vec_c(foo = c(a = 1))
  #> Error: Can't merge the outer name `foo` with a named vector.
  #> Please supply a `.name_spec` specification.

  vec_c(foo = 1:3)
  #> Error: Can't merge the outer name `foo` with a vector of length > 1.
  #> Please supply a `.name_spec` specification.
  ```

  You can supply a name specification that describes how to combine
  the external name of the input with its internal names or positions:

  ```
  # Name spec as glue string:
  vec_c(foo = c(a = 1), .name_spec = "{outer}_{inner}")

  # Name spec as a function:
  vec_c(foo = c(a = 1), .name_spec = function(outer, inner) paste(outer, inner, sep = "_"))
  vec_c(foo = c(a = 1), .name_spec = ~ paste(.x, .y, sep = "_"))
  ```

* `vec_empty()` has been renamed to `vec_is_empty()`.

* `vec_dim()` and `vec_dims()` are no longer exported.

* `vec_na()` has been renamed to `vec_init()`, as the primary use case
  is to initialize an output container.

* `vec_slice<-` is now type stable (#140). It always returns the same
  type as the LHS. If needed, the RHS is cast to the correct type, but
  only if both inputs are coercible. See examples in `?vec_slice`.

* We have renamed the `type` particle to `ptype`:

  - `vec_type()` => `vec_ptype()`
  - `vec_type2()` => `vec_ptype2()`
  - `vec_type_common()` => `vec_ptype_common()`

  Consequently, `vec_ptype()` was renamed to `vec_ptype_show()`.

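The lossy-cast and type-stability items above can be illustrated with a short sketch. It assumes a recent vctrs release; a generic `error` handler is used rather than a specific condition class.

```r
library(vctrs)

# 1.5 cannot be represented as an integer, so the cast is an error...
lossy <- tryCatch(vec_cast(1.5, integer()), error = function(e) e)

# ...unless the loss is explicitly allowed, which yields the partial result.
allowed <- allow_lossy_cast(vec_cast(1.5, integer()))  # 1L

# `vec_slice<-` keeps the type of the left-hand side: the double 10 is
# cast to integer because the conversion is lossless.
x <- 1:3
vec_slice(x, 2) <- 10
x_type <- typeof(x)  # "integer"
```
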
## New features

* New `vec_proxy()` generic. This is the main customisation point in
  vctrs along with `vec_restore()`. You should only implement it when
  your type is designed around a non-vector class (atomic vectors,
  bare lists, data frames). In this case, `vec_proxy()` should return
  such a vector class. The vctrs operations will be applied on the
  proxy and `vec_restore()` is called to restore the original
  representation of your type.

  The most common case where you need to implement `vec_proxy()` is
  for S3 lists. In vctrs, S3 lists are treated as scalars by
  default. This way we don't treat objects like model fits as
  vectors. To prevent vctrs from treating your S3 list as a scalar,
  unclass it from the `vec_proxy()` method. For instance here is the
  definition for `list_of`:

  ```
  #' @export
  vec_proxy.vctrs_list_of <- function(x) {
    unclass(x)
  }
  ```

  If you inherit from `vctrs_vctr` or `vctrs_rcrd` you don't need to
  implement `vec_proxy()`.

* `vec_c()`, `vec_rbind()`, and `vec_cbind()` gain a `.name_repair`
  argument (#227, #229).

* `vec_c()`, `vec_rbind()`, `vec_cbind()`, and all functions relying
  on `vec_ptype_common()` now have more informative error messages
  when some of the inputs have nested data frames that are not
  convergent:

  ```
  df1 <- tibble(foo = tibble(bar = tibble(x = 1:3, y = letters[1:3])))
  df2 <- tibble(foo = tibble(bar = tibble(x = 1:3, y = 4:6)))

  vec_rbind(df1, df2)
  #> Error: No common type for `..1$foo$bar$y` <character> and `..2$foo$bar$y` <integer>.
  ```

* `vec_cbind()` now turns named data frames to packed columns.

  ```r
  data <- tibble::tibble(x = 1:3, y = letters[1:3])
  data <- vec_cbind(data, packed = data)
  data
  # A tibble: 3 x 3
        x y     packed$x $y
    <int> <chr>    <int> <chr>
  1     1 a            1 a
  2     2 b            2 b
  3     3 c            3 c
  ```

  Packed data frames are nested in a single column. This makes it
  possible to access it through a single name:

  ```r
  data$packed
  # A tibble: 3 x 2
        x y
    <int> <chr>
  1     1 a
  2     2 b
  3     3 c
  ```

  We are planning to use this syntax more widely in the tidyverse.

* New `vec_is()` function to check whether a vector conforms to a
  prototype and/or a size. Unlike `vec_assert()`, it doesn't throw
  errors but returns `TRUE` or `FALSE` (#79).

  Called without a specific type or size, `vec_assert()` tests whether
  an object is a data vector or a scalar. S3 lists are treated as
  scalars by default. Implement a `vec_is_vector()` for your class to
  override this property (or derive from `vctrs_vctr`).

* New `vec_order()` and `vec_sort()` for ordering and sorting
  generalised vectors.

* New `.names_to` parameter for `vec_rbind()`. If supplied, this
  should be the name of a column where the names of the inputs are
  copied. This is similar to the `.id` parameter of
  `dplyr::bind_rows()`.

* New `vec_seq_along()` and `vec_init_along()` create useful sequences (#189).

* `vec_slice()` now preserves character row names, if present.

* New `vec_split(x, by)` is a generalisation of `split()` that can divide
  a vector into groups formed by the unique values of another vector. Returns
  a two-column data frame containing unique values of `by` aligned with
  matching `x` values (#196).

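Several of the new functions listed above can be exercised in a few lines. This is a sketch with made-up inputs, assuming a recent vctrs release in which these functions keep the behaviour described here.

```r
library(vctrs)

# vec_is() returns TRUE or FALSE instead of throwing like vec_assert().
is_int <- vec_is(1:3, integer())                   # TRUE
is_int_size2 <- vec_is(1:3, integer(), size = 2L)  # FALSE
is_chr_int <- vec_is("a", integer())               # FALSE

# Ordering and sorting.
ord <- vec_order(c(3, 1, 2))  # 2 3 1
srt <- vec_sort(c(3, 1, 2))   # 1 2 3

# vec_seq_along() follows vec_size(): it counts rows for a data frame.
df <- data.frame(x = 1:3, y = 4:6)
rows <- vec_seq_along(df)  # 1 2 3

# vec_split() pairs each unique `by` value with the matching `x` values.
sp <- vec_split(x = c(10, 20, 30, 40), by = c("a", "b", "a", "b"))

# `.names_to` copies the input names into a column.
bound <- vec_rbind(a = data.frame(x = 1), b = data.frame(x = 2), .names_to = "id")
```

Here `sp$key` holds the unique values of `by` and `sp$val` is a list with the corresponding slices of `x`.
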
## Other features and bug fixes

* vctrs now uses classed errors of class `"vctrs_error_assert"` for failed
  assertions, and of class `"vctrs_error_incompatible"` (with
  subclasses `_type`, `_cast` and `_op`) for errors on incompatible
  types (#184).

* Character indexing is now only supported for named objects, an error
  is raised for unnamed objects (#171).

* Predicate generics now consistently return logical vectors when
  passed a `vctrs_vctr` class. They used to restore the output to
  their input type (#251).

* `list_of()` now has an `as.character()` method. It uses
  `vec_ptype_abbr()` to collapse complex objects into their type
  representation (tidyverse/tidyr#654).

* New `stop_incompatible_size()` to signal a failure due to mismatched sizes.

* New `validate_list_of()` (#193).

* `vec_arith()` is consistent with base R when combining `difftime`
  and `date`, with a warning if casts are lossy (#192).

* `vec_c()` and `vec_rbind()` now handle data.frame columns properly
  (@yutannihilation, #182).

* `vec_cast(x, data.frame())` preserves the number of rows in `x`.

* `vec_equal()` now handles missing values symmetrically (#204).

* `vec_equal_na()` now returns `TRUE` for data frames and records when
  every component is missing, not when _any_ component is missing
  (#201).

* `vec_init()` checks input is a vector.

* `vec_proxy_compare()` gains an experimental `relax` argument, which
  allows data frames to be orderable even if all their columns are not
  (#210).

* `vec_size()` now works with positive short row names. This fixes
  issues with data frames created with jsonlite (#220).

* `vec_slice<-` now has a `vec_assign()` alias. Use `vec_assign()`
  when you don't want to modify the original input.

* `vec_slice()` now calls `vec_restore()` automatically. Unlike the
  default `[` method from base R, attributes are preserved by default.

* `vec_slice()` can correctly slice 0-row data frames (#179).

* New `vec_repeat()` for repeating each element of a vector the same number
  of times.

* `vec_type2(x, data.frame())` ensures that the returned object has
  names that are a length-0 character vector.