NumPy-按频率快速稳定地对大型数组进行arg排序（数组按照频率排序）

25-04-23 4

最近很多小伙伴都在问NumPy-按频率快速稳定地对大型数组进行arg排序和数组按照频率排序这两个问题，那么本篇文章就来给大家详细解答一下，同时本文还将给你拓展"importnumpyasnp"Impo

最近很多小伙伴都在问NumPy-按频率快速稳定地对大型数组进行arg排序和数组按照频率排序这两个问题，那么本篇文章就来给大家详细解答一下，同时本文还将给你拓展"import numpy as np" ImportError: No module named numpy、3.7Python 数据处理篇之 Numpy 系列 (七)---Numpy 的统计函数、Anaconda Numpy 错误“Importing the Numpy C Extension Failed”是否有另一种解决方案、Difference between import numpy and import numpy as np等相关知识，下面开始了哦！

本文目录一览：

NumPy-按频率快速稳定地对大型数组进行arg排序（数组按照频率排序）
"import numpy as np" ImportError: No module named numpy
3.7Python 数据处理篇之 Numpy 系列 (七)---Numpy 的统计函数
Anaconda Numpy 错误“Importing the Numpy C Extension Failed”是否有另一种解决方案
Difference between import numpy and import numpy as np

NumPy-按频率快速稳定地对大型数组进行arg排序（数组按照频率排序）

如何解决NumPy-按频率快速稳定地对大型数组进行arg排序

我有任何类似a的大型一维NumPy数组dtype，它的某些元素可能会重复。

如何找到排序索引ix，该索引将按值的降序/升序对a sense described here进行稳定排序（question here中的稳定性）？

我想找到最快最简单的方法。也许现有的标准numpy函数可以做到这一点。

还有另一个相关的Try it online!，但它专门要求删除重复的数组，即仅输出唯一的排序值，我需要原始数组的所有值，包括重复的值。

我已经编写了第一个试用版来完成该任务，但是它不是最快的（使用Python的循环），并且可能不是最短/最简单的形式。如果相等元素的重复次数不高且数组很大，则此python循环可能会非常昂贵。如果在NumPy中可用（例如，虚构的a），那么具有简短的功能来完成所有操作也将是一件好事。

np.argsort_by_freq()

输出：

import numpy as np
np.random.seed(1)
hi,n,desc = 7,24,True
a = np.random.choice(np.arange(hi),(n,),p = (
    lambda p = np.random.random((hi,)): p / p.sum()
)())
us,cs = np.unique(a,return_counts = True)
af = np.zeros(n,dtype = np.int64)
for u,c in zip(us,cs):
    af[a == u] = c
if desc:
    ix = np.argsort(-af,kind = ''stable'') # Descending sort
else:
    ix = np.argsort(af,kind = ''stable'') # Ascending sort
print(''rows: i_col(0) / original_a(1) / freqs(2) / sorted_a(3)'')
print(''    / sorted_freqs(4) / sorting_ix(5)'')
print(np.stack((
    np.arange(n),a,af,a[ix],af[ix],ix,0))

解决方法

我可能会丢失一些东西，但是似乎可以使用Counter然后根据该元素的值的计数，使用该元素的值，然后使用该索引断开联系，对每个元素的索引进行排序。例如：

from collections import Counter
a = [ 1,1,3,5,4,6,0]
counts = Counter(a)
t = [(counts[v],v,i) for i,v in enumerate(a)]
t.sort()
print([v[2] for v in t])
t.sort(reverse=True)
print([v[2] for v in t])

输出：

[13,14,8,16,17,18,22,2,9,10,15,7,11,12,19,20,21,23]
[23,13]

如果要保持具有相等计数的组的索引的升序，则可以对降序使用lambda函数：

t.sort(key = lambda x:(-x[0],-x[1],x[2]))
print([v[2] for v in t])

输出：

[5,23,13]

如果要按照元素最初出现在数组中的顺序来保持元素的顺序如果它们的计数相同，则不要按值排序，而按其索引排序数组中的第一个匹配项：

a = [ 1,0]
counts = Counter(a)
idxs = {}
t = []
for i,v in enumerate(a):
    if not v in idxs:
        idxs[v] = i
    t.append((counts[v],idxs[v],i))
t.sort()
print([v[2] for v in t])
t.sort(key = lambda x:(-x[0],x[1],x[2]))
print([v[2] for v in t])

输出：

[13,23]
[5,13,14]

要根据计数排序，然后再在数组中定位，则完全不需要该值或第一个索引：

from collections import Counter
a = [ 1,v in enumerate(a)]
t.sort()
print([v[1] for v in t])
t.sort(key = lambda x:(-x[0],x[1]))
print([v[1] for v in t])

对于字符串数组，这将产生与示例数据的先前代码相同的输出：

a = [''g'',''g'',''c'',''f'',''d'',''a'',''e'',''b'',''f'' ]

这将产生输出：

[18,24]
[3,24,18]

我只是使用不带python循环的numpy函数就可以解决任何dtype的非常快速的解决方案，它可以在O(N log N)时间内工作。使用的numpy函数：np.unique，np.argsort和数组索引。

尽管在原始问题中并未询问，但我实现了额外的标记equal_order_by_val，如果它为False，则将具有相同频率的数组元素排序为相等的稳定范围，这意味着可能会有c d d c d c输出在下面的输出转储中，因为这是元素以相同频率进入原始数组的顺序。当flag为True时，此类元素还按原始数组的值排序，结果为c c c d d d。换句话说，在False的情况下，我们仅按键freq进行稳定排序，当它为True时，我们按(freq,value)进行升序排序，并按(-freq,value)进行降序排序。

Try it online!

import string,math
import numpy as np
np.random.seed(0)
# Generating input data
hi,n,desc = 7,25,True
letters = np.array(list(string.ascii_letters),dtype = np.object_)[:hi]
a = np.random.choice(letters,(n,),p = (
    lambda p = np.random.random((letters.size,)): p / p.sum()
)())
for equal_order_by_val in [False,True]:
    # Solving task
    us,ui,cs = np.unique(a,return_inverse = True,return_counts = True)
    af = cs[ui]
    sort_key = -af if desc else af
    if equal_order_by_val:
        shift_bits = max(1,math.ceil(math.log(us.size) / math.log(2)))
        sort_key = ((sort_key.astype(np.int64) << shift_bits) +
            np.arange(us.size,dtype = np.int64)[ui])
    ix = np.argsort(sort_key,kind = ''stable'') # Do sorting itself
    # Printing results
    print(''\\nequal_order_by_val:'',equal_order_by_val)
    for name,val in [
        (''i_col'',np.arange(n)),(''original_a'',a),(''freqs'',af),(''sorted_a'',a[ix]),(''sorted_freqs'',af[ix]),(''sorting_ix'',ix),]:
        print(name.rjust(12),'' ''.join([str(e).rjust(2) for e in val]))

输出：

equal_order_by_val: False
       i_col  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
  original_a  g  g  c  f  d  d  g  a  a  a  f  f  f  g  f  c  f  a  e  b  g  d  c  b  f
       freqs  5  5  3  7  3  3  5  4  4  4  7  7  7  5  7  3  7  4  1  2  5  3  3  2  7
    sorted_a  f  f  f  f  f  f  f  g  g  g  g  g  a  a  a  a  c  d  d  c  d  c  b  b  e
sorted_freqs  7  7  7  7  7  7  7  5  5  5  5  5  4  4  4  4  3  3  3  3  3  3  2  2  1
  sorting_ix  3 10 11 12 14 16 24  0  1  6 13 20  7  8  9 17  2  4  5 15 21 22 19 23 18
equal_order_by_val: True
       i_col  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
  original_a  g  g  c  f  d  d  g  a  a  a  f  f  f  g  f  c  f  a  e  b  g  d  c  b  f
       freqs  5  5  3  7  3  3  5  4  4  4  7  7  7  5  7  3  7  4  1  2  5  3  3  2  7
    sorted_a  f  f  f  f  f  f  f  g  g  g  g  g  a  a  a  a  c  c  c  d  d  d  b  b  e
sorted_freqs  7  7  7  7  7  7  7  5  5  5  5  5  4  4  4  4  3  3  3  3  3  3  2  2  1
  sorting_ix  3 10 11 12 14 16 24  0  1  6 13 20  7  8  9 17  2 15 22  4  5 21 19 23 18

"import numpy as np" ImportError: No module named numpy

问题：没有安装 numpy

解决方法：

下载文件，安装

numpy-1.8.2-win32-superpack-python2.7

安装运行 import numpy，出现

Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    import numpy
  File "C:\Python27\lib\site-packages\numpy\__init__.py", line 153, in <module>
    from . import add_newdocs
  File "C:\Python27\lib\site-packages\numpy\add_newdocs.py", line 13, in <module>
    from numpy.lib import add_newdoc
  File "C:\Python27\lib\site-packages\numpy\lib\__init__.py", line 8, in <module>
    from .type_check import *
  File "C:\Python27\lib\site-packages\numpy\lib\type_check.py", line 11, in <module>
    import numpy.core.numeric as _nx
  File "C:\Python27\lib\site-packages\numpy\core\__init__.py", line 6, in <module>
    from . import multiarray
ImportError: DLL load failed: %1 不是有效的 Win32 应用程序。

原因是：python 装的是 64 位的，numpy 装的是 32 位的

重新安装 numpy 为：numpy-1.8.0-win64-py2.7

3.7Python 数据处理篇之 Numpy 系列 (七)---Numpy 的统计函数

前言

具体我们来学 Numpy 的统计函数

（一）函数一览表

调用方式：np.*

.sum(a)	对数组 a 求和
.mean(a)	求数学期望
.average(a)	求平均值
.std(a)	求标准差
.var(a)	求方差
.ptp(a)	求极差
.median(a)	求中值，即中位数
.min(a)	求最大值
.max（a)	求最小值
.argmin(a)	求最小值的下标，都处里为一维的下标
.argmax(a)	求最大值的下标，都处里为一维的下标
.unravel_index(index, shape)	g 根据 shape, 由一维的下标生成多维的下标

（二）统计函数 1

（1）说明

（2）输出

.sum(a)

.mean(a)

.average(a)

.std(a)

.var(a)

（三）统计函数 2

（1）说明

（2）输出

.max(a) .min(a)

.ptp(a)

.median(a)

.argmin(a)

.argmax(a)

.unravel_index(index,shape)

作者：Mark

日期：2019/02/11 周一

Anaconda Numpy 错误“Importing the Numpy C Extension Failed”是否有另一种解决方案

如何解决Anaconda Numpy 错误“Importing the Numpy C Extension Failed”是否有另一种解决方案？

希望有人能在这里提供帮助。我一直在绕圈子一段时间。我只是想设置一个 python 脚本，它将一些 json 数据从 REST API 加载到云数据库中。我在 Anaconda 上设置了一个虚拟环境（因为 GCP 库推荐这样做），安装了依赖项，现在我只是尝试导入库并向端点发送请求。我使用 Conda（和 conda-forge）来设置环境并安装依赖项，所以希望一切都干净。我正在使用带有 Python 扩展的 VS 编辑器作为编辑器。每当我尝试运行脚本时，我都会收到以下消息。我已经尝试了其他人在 Google/StackOverflow 上找到的所有解决方案，但没有一个有效。我通常使用 IDLE 或 Jupyter 进行脚本编写，没有任何问题，但我对 Anaconda、VS 或环境变量（似乎是相关的）没有太多经验。在此先感谢您的帮助！

  \Traceback (most recent call last):
File "C:\Conda\envs\gcp\lib\site-packages\numpy\core\__init__.py",line 22,in <module>
from . import multiarray
File "C:\Conda\envs\gcp\lib\site-packages\numpy\core\multiarray.py",line 12,in <module>
from . import overrides
File "C:\Conda\envs\gcp\lib\site-packages\numpy\core\overrides.py",line 7,in <module>
from numpy.core._multiarray_umath import (
ImportError: DLL load Failed while importing _multiarray_umath: The specified module Could not be found.

During handling of the above exception,another exception occurred:

Traceback (most recent call last):
File "c:\API\citi-bike.py",line 4,in <module>
import numpy as np
File "C:\Conda\envs\gcp\lib\site-packages\numpy\__init__.py",line 150,in <module>
from . import core
File "C:\Conda\envs\gcp\lib\site-packages\numpy\core\__init__.py",line 48,in <module>
raise ImportError(msg)
ImportError:

IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!

Importing the numpy C-extensions Failed. This error can happen for
many reasons,often due to issues with your setup or how NumPy was
installed.

We have compiled some common reasons and troubleshooting tips at:

https://numpy.org/devdocs/user/troubleshooting-importerror.html

Please note and check the following:

* The Python version is: python3.9 from "C:\Conda\envs\gcp\python.exe"
* The NumPy version is: "1.21.1"

and make sure that they are the versions you expect.
Please carefully study the documentation linked above for further help.

Original error was: DLL load Failed while importing _multiarray_umath: The specified module Could not be found.

解决方法

暂无找到可以解决该程序问题的有效方法，小编努力寻找整理中！

如果你已经找到好的解决方法，欢迎将解决方案带上本链接一起发送给小编。

小编邮箱:dio#foxmail.com (将#修改为@）

Difference between import numpy and import numpy as np

up vote 18 down vote favorite

I understand that when possible one should use

import numpy as np

This helps keep away any conflict due to namespaces. But I have noticed that while the command below works

import numpy.f2py as myf2py

the following does not

import numpy as np
np.f2py #throws no module named f2py

Can someone please explain this?

python numpy

shareimprove this question

edited Mar 24 ''14 at 23:20

mu 無

24.7k104471

asked Mar 24 ''14 at 23:19

user1318806

3001311

@roippi have you tried exit your python and enter it and just do import numpy then numpy.f2py ? It throws an error in my case too – aha Mar 24 ''14 at 23:24

Importing a module doesn''t import sub-modules. You need to explicitly import the numpy.f2py module regardless of whether or not/how numpy itself has been imported. – alecb Mar 24 ''14 at 23:39

add a comment

4 Answers

active oldest votes

up vote 13 down vote

numpy is the top package name, and doing import numpy doesn''t import submodule numpy.f2py.

When you do import numpy it creats a link that points to numpy, but numpy is not further linked to f2py. The link is established when you do import numpy.f2py

In your above code:

import numpy as np # np is an alias pointing to numpy, but at this point numpy is not linked to numpy.f2py
import numpy.f2py as myf2py # this command makes numpy link to numpy.f2py. myf2py is another alias pointing to numpy.f2py as well

Here is the difference between import numpy.f2py and import numpy.f2py as myf2py:

import numpy.f2py
- put numpy into local symbol table(pointing to numpy), and numpy is linked to numpy.f2py
- both numpy and numpy.f2py are accessible
import numpy.f2py as myf2py
- put my2py into local symbol table(pointing to numpy.f2py)
- Its parent numpy is not added into local symbol table. Therefore you can not access numpy directly

shareimprove this answer

edited Mar 25 ''14 at 0:31

answered Mar 24 ''14 at 23:33

aha

1,2291718

add a comment

up vote 7 down vote

The import as syntax was introduced in PEP 221 and is well documented there.

When you import a module via

import numpy

the numpy package is bound to the local variable numpy. The import as syntax simply allows you to bind the import to the local variable name of your choice (usually to avoid name collisions, shorten verbose module names, or standardize access to modules with compatible APIs).

Thus,

import numpy as np

is equivalent to,

import numpy
np = numpy
del numpy

When trying to understand this mechanism, it''s worth remembering that import numpy actually means import numpy as numpy.

When importing a submodule, you must refer to the full parent module name, since the importing mechanics happen at a higher level than the local variable scope. i.e.

import numpy as np
import numpy.f2py   # OK
import np.f2py      # ImportError

I also take issue with your assertion that "where possible one should [import numpy as np]". This is done for historical reasons, mostly because people get tired very quickly of prefixing every operation with numpy. It has never prevented a name collision for me (laziness of programmers actually suggests there''s a higher probability of causing a collision with np)

Finally, to round out my exposé, here are 2 interesting uses of the import as mechanism that you should be aware of:

1. long subimports

import scipy.ndimage.interpolation as warp
warp.affine_transform(I, ...)

2. compatible APIs

try:
    import pyfftw.interfaces.numpy_fft as fft
except:
    import numpy.fft as fft
# call fft.ifft(If) with fftw or the numpy fallback under a common name

shareimprove this answer

answered Mar 25 ''14 at 0:59

hbristow

68345

add a comment

up vote 1 down vote

numpy.f2py is actually a submodule of numpy, and therefore has to be imported separately from numpy. As aha said before:

When you do import numpy it creats a link that points to numpy, but numpy is not further linked to f2py. The link is established when you do import numpy.f2py

when you call the statement import numpy as np, you are shortening the phrase "numpy" to "np" to make your code easier to read. It also helps to avoid namespace issues. (tkinter and ttk are a good example of what can happen when you do have that issue. The UIs look extremely different.)

shareimprove this answer

answered Mar 24 ''14 at 23:47

bspymaster

760923

add a comment

up vote 1 down vote

This is a language feature. f2py is a subpackage of the module numpy and must be loaded separately.

This feature allows:

you to load from numpy only the packages you need, speeding up execution.
the developers of f2py to have namespace separation from the developers of another subpackage.

Notice however that import numpy.f2py or its variant import numpy.f2py as myf2py are still loading the parent module numpy.

Said that, when you run

import numpy as np
np.f2py

You receive an AttributeError because f2py is not an attribute of numpy, because the __init__() of the package numpy did not declare in its scope anything about the subpackage f2py.

shareimprove this answer

answered Mar 24 ''14 at 23:57

gg349

7,67321739

when you do import numpy.f2py as myf2py, how do you access its parent numpy? it seems import numpy.f2py allows you to access its parent numpy, but import numpy.f2py as myf2py doesn''t – aha Mar 25 ''14 at 0:00

You don''t access it because you decided you didn''t want to use anything from numpy, and you only care of using the subpackage. It is similar to using from foo import bar: the name foo will not be accessible. See the comment after the first example of the docs, LINK – gg349 Mar 25 ''14 at 0:05

add a comment

关于NumPy-按频率快速稳定地对大型数组进行arg排序和数组按照频率排序的问题就给大家分享到这里，感谢你花时间阅读本站内容，更多关于"import numpy as np" ImportError: No module named numpy、3.7Python 数据处理篇之 Numpy 系列 (七)---Numpy 的统计函数、Anaconda Numpy 错误“Importing the Numpy C Extension Failed”是否有另一种解决方案、Difference between import numpy and import numpy as np等相关知识的信息别忘了在本站进行查找喔。

本文标签：