R language: sort function (help page)
2015-12-16 14:43:28

sort {base} 

[Please credit the source when reposting. Translated by the "R语言中文帮助" (R Language Chinese Help) studio, 胡桃木屋 mathapply.cn]


Sorting or Ordering Vectors


Description

Sort (or order) a vector or factor (partially) into ascending or descending order. For ordering along more than one variable, e.g., for sorting data frames, see order.
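As a minimal sketch of the description above, the following sorts a small vector in both directions and then uses order to sort a data frame on two columns. The names x, df, g and v are invented for illustration and do not come from the help page.

```r
# Illustrative data only; x, df, g and v are hypothetical names.
x <- c(3, 1, 2)
asc  <- sort(x)                     # ascending (the default)
desc <- sort(x, decreasing = TRUE)  # descending

# sort() works on a single vector; for a data frame, build a row
# permutation with order() and index the rows with it.
df <- data.frame(g = c("b", "a", "b", "a"), v = c(2, 4, 1, 3))
df_sorted <- df[order(df$g, df$v), ]
print(asc)
print(desc)
print(df_sorted)
```

Here order(df$g, df$v) sorts by g first and breaks ties within g using v.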


Usage
sort(x, decreasing = FALSE, ...)

## Default S3 method:
sort(x, decreasing = FALSE, na.last = NA, ...)

sort.int(x, partial = NULL, na.last = NA, decreasing = FALSE,
         method = c("shell", "quick"), index.return = FALSE)


Arguments

x
     for sort, an R object with a class or a numeric, complex, character or logical vector. For sort.int, a numeric, complex, character or logical vector, or a factor.

decreasing
     logical. Should the sort be increasing or decreasing? Not available for partial sorting.

...
     arguments to be passed to or from methods or (for the default methods and objects without a class) to sort.int.

na.last
     for controlling the treatment of NAs. If TRUE, missing values in the data are put last; if FALSE, they are put first; if NA, they are removed.

partial
     NULL or a vector of indices for partial sorting.

method
     character string specifying the algorithm used. Not available for partial sorting. Can be abbreviated.

index.return
     logical indicating if the ordering index vector should be returned as well; this is only available for a few cases, the default na.last = NA and full sorting of non-factors.
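The three settings of na.last described above can be compared on a small vector. This is only an illustrative sketch; the vector x is made up for the example.

```r
# Hypothetical input with one missing value.
x <- c(2, NA, 1)

dropped <- sort(x)                    # na.last = NA (default): NA removed
na_end  <- sort(x, na.last = TRUE)    # NA placed last
na_head <- sort(x, na.last = FALSE)   # NA placed first

print(dropped)
print(na_end)
print(na_head)
```

Note that the default silently shortens the result, so length(sort(x)) can be smaller than length(x).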


Details

    sort is a generic function for which methods can be written, and sort.int is the internal method which is compatible with S if only the first three arguments are used.

    The default sort method makes use of order for classed objects, which in turn makes use of the generic function xtfrm (and can be slow unless an xtfrm method has been defined or is.numeric(x) is true).

    Complex values are sorted first by the real part, then the imaginary part.
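A short sketch of the complex ordering rule, using made-up values in which two elements share the same real part so that the imaginary part decides their order:

```r
# Hypothetical complex vector: two values have real part 1.
z <- complex(real = c(2, 1, 1), imaginary = c(0, 3, -1))

# Real part is compared first, imaginary part breaks ties.
z_sorted <- sort(z)
print(z_sorted)
```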

    The sort order for character vectors will depend on the collating sequence of the locale in use: see Comparison. The sort order for factors is the order of their levels (which is particularly appropriate for ordered factors).
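To illustrate the difference between the two rules, the sketch below sorts the same labels once as a factor with explicitly chosen levels and once as plain character strings. The labels and the level order are assumptions made for the example.

```r
# Hypothetical factor whose levels are deliberately not alphabetical.
f <- factor(c("low", "high", "mid", "low"),
            levels = c("low", "mid", "high"))

# Factors sort by level order: low, then mid, then high.
sorted_f <- sort(f)
print(sorted_f)

# The same labels as character strings sort by the locale's collating
# sequence instead, so the result may differ from the factor ordering.
sorted_chr <- sort(as.character(f))
print(sorted_chr)
```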

    If partial is not NULL, it is taken to contain indices of elements of the result which are to be placed in their correct positions in the sorted array by partial sorting. For each of the result values in a specified position, any values smaller than that one are guaranteed to have a smaller index in the sorted array and any values which are greater are guaranteed to have a bigger index in the sorted array. (This is included for efficiency, and many of the options are not available for partial sorting. It is only substantially more efficient if partial has a handful of elements, and a full sort is done (a Quicksort if possible) if there are more than 10.) Names are discarded for partial sorting.
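The guarantee for partial sorting can be checked directly. The following is a sketch on a random permutation of 1 to 100 (the seed and the choice of position 10 are arbitrary): after sort(x, partial = 10), position 10 holds the value a full sort would put there, everything before it is smaller and everything after it is larger, but the two sides are not necessarily sorted themselves.

```r
set.seed(1)                 # arbitrary seed, only for reproducibility
x <- sample(100)            # a permutation of 1..100, so no ties

p <- sort(x, partial = 10)  # only position 10 is guaranteed correct

tenth        <- p[10]                     # the 10th smallest value
left_smaller <- all(p[1:9] < p[10])       # values before it are smaller
right_larger <- all(p[11:100] > p[10])    # values after it are larger

print(tenth)
print(left_smaller)
print(right_larger)
```

This is useful when only a quantile or a cut-off value is needed rather than the fully sorted vector.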

    Method "shell" uses Shellsort (an O(n^{4/3}) variant from Sedgewick (1986)). If x has names a stable modification is used, so ties are not reordered. (This only matters if names are present.)

    Method "quick" uses Singleton (1969)'s implementation of Hoare's Quicksort method and is only available when x is numeric (double or integer) and partial is NULL. (For other types of x Shellsort is used, silently.) It is normally somewhat faster than Shellsort (perhaps 50% faster on vectors of length a million and twice as fast at a billion) but has poor performance in the rare worst case. (Peto's modification using a pseudo-random midpoint is used to make the worst case rarer.) This is not a stable sort, and ties may be reordered.

    Factors with fewer than 100,000 levels are sorted by radix sorting when method is not supplied: see sort.list.


Value

    For sort, the result depends on the S3 method which is dispatched. If x does not have a class, sort.int is used and its description applies. For classed objects which do not have a specific method the default method will be used and is equivalent to x[order(x, ...)]: this depends on the class having a suitable method for [ (and also that order will work, which is not the case for a class based on a list).
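As a sketch of the x[order(x, ...)] equivalence, the example below uses a Date vector, assuming (as in the R versions I am aware of) that no dedicated sort method exists for the Date class and the default method is therefore dispatched. The dates themselves are arbitrary.

```r
# Hypothetical classed object: a Date vector without missing values.
d <- as.Date(c("2015-12-16", "2013-01-01", "2014-06-30"))

via_sort  <- sort(d)
via_order <- d[order(d)]

# Both routes give the same result, and the class survives because
# the Date class has a suitable "[" method.
same  <- identical(via_sort, via_order)
klass <- class(via_sort)
print(via_sort)
print(same)
print(klass)
```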

    For sort.int the value is the sorted vector unless index.return is true, when the result is a list with components named x and ix containing the sorted numbers and the ordering index vector. In the latter case, if method == "quick" ties may be reversed in the ordering (unlike sort.list) as quicksort is not stable. NB: the index vector refers to element numbers after removal of NAs: see order if you want the original element numbers.
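The note about NAs is easy to overlook, so here is a small sketch with a made-up vector containing one NA. It compares the ix component returned by sort.int with the positions returned by order on the original vector.

```r
# Hypothetical input: the NA sits at position 2.
x <- c(30, NA, 10, 20)

res <- sort.int(x, index.return = TRUE)
sorted_values <- res$x    # sorted values, NA dropped
idx_after_na  <- res$ix   # positions within c(30, 10, 20), i.e. after NA removal

# order() with na.last = NA also drops the NA but reports positions
# in the original vector.
idx_original <- order(x, na.last = NA)

print(sorted_values)
print(idx_after_na)
print(idx_original)
```

Indexing the original x with res$ix would therefore pick the wrong elements whenever x contains NAs; use the result of order for that purpose.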

    All attributes are removed from the return value (see Becker et al, 1988, p.146) except names, which are sorted. (If partial is specified even the names are removed.) Note that this means that the returned value has no class, except for factors and ordered factors (which are treated specially and whose result is transformed back to the original class).
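A brief sketch of the attribute handling for an unclassed vector: names travel with their values, while any other attribute is dropped. The attribute name "unit" and the data are invented for the example.

```r
# Hypothetical named vector with one extra, non-class attribute.
x <- c(b = 2, a = 3, c = 1)
attr(x, "unit") <- "cm"

s <- sort(x)

sorted_names <- names(s)               # names are reordered with the values
unit_dropped <- is.null(attr(s, "unit"))  # the extra attribute is gone

print(s)
print(sorted_names)
print(unit_dropped)
```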


References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Knuth, D. E. (1998) The Art of Computer Programming, Volume 3: Sorting and Searching. 2nd ed. Addison-Wesley.

Sedgewick, R. (1986) A new upper bound for Shell sort. J. Algorithms 7, 159–173.

Singleton, R. C. (1969) An efficient algorithm for sorting with minimal storage: Algorithm 347. Communications of the ACM 12, 185–187.


See Also

‘Comparison’ for how character strings are collated.

order for sorting on or reordering multiple variables.

is.unsorted, rank.


Examples
require(stats)

x <- swiss$Education[1:25]
x; sort(x); sort(x, partial = c(10, 15))

## illustrate 'stable' sorting (of ties):
sort(c(10:3, 2:12), method = "sh", index.return = TRUE) # is stable
## $x : 2  3  3  4  4  5  5  6  6  7  7  8  8  9  9 10 10 11 12
## $ix: 9  8 10  7 11  6 12  5 13  4 14  3 15  2 16  1 17 18 19
sort(c(10:3, 2:12), method = "qu", index.return = TRUE) # is not
## $x : 2  3  3  4  4  5  5  6  6  7  7  8  8  9  9 10 10 11 12
## $ix: 9 10  8  7 11  6 12  5 13  4 14  3 15 16  2 17  1 18 19

x <- c(1:3, 3:5, 10)
is.unsorted(x)                  # FALSE: is sorted
is.unsorted(x, strictly = TRUE) # TRUE : is not (and cannot be)
                                # sorted strictly
## Not run:
## Small speed comparison simulation:
N <- 2000
Sim <- 20
rep <- 1000 # << adjust to your CPU
c1 <- c2 <- numeric(Sim)
for(is in seq_len(Sim)){
  x <- rnorm(N)
  c1[is] <- system.time(for(i in 1:rep) sort(x, method = "shell"))[1]
  c2[is] <- system.time(for(i in 1:rep) sort(x, method = "quick"))[1]
  stopifnot(sort(x, method = "s") == sort(x, method = "q"))
}
rbind(ShellSort = c1, QuickSort = c2)
cat("Speedup factor of quick sort():\n")
summary({qq <- c1 / c2; qq[is.finite(qq)]})

## A larger test
x <- rnorm(1e7)
system.time(x1 <- sort(x, method = "shell"))
system.time(x2 <- sort(x, method = "quick"))
stopifnot(identical(x1, x2))

## End(Not run)


[Package base version 3.2.2 Index]
 

Copyright 2013 胡桃木屋. 湘ICP备13006789号-1