Loading .RData files into Python (converting rdata files to csv)

25-03-12 8

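This article covers how to load .RData files into Python, with particular attention to converting rdata files to csv. It also collects related questions: node.js – Can't find Python executable "/path/to/executable/python2.7", you can set the PYTHON env variable; python – Flask: loading a data file into memory with a global variable; python – How do I load multiple gpx files into PostGIS?; and Python - Redirecting stdout to a file in Python?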

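Contents:

Loading .RData files into Python (converting rdata files to csv)
node.js – Can't find Python executable "/path/to/executable/python2.7", you can set the PYTHON env variable
python – Flask: loading a data file into memory with a global variable
python – How do I load multiple gpx files into PostGIS?
Python - Redirecting stdout to a file in Python?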

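Loading .RData files into Python (converting rdata files to csv)

I have a bunch of .RData time-series files and would like to load them directly into Python without first converting them to another format (such as .csv). Any ideas on the best way to do this?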

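Answer 1

People ask this sort of thing on the R-help and R-dev lists, and the usual answer is that the code is the documentation for the .RData file format. So any other implementation in any other language is hard++.

I think the only reasonable way is to install RPy2, use R's load function from there, and convert to appropriate Python objects as you go. A .RData file can contain structured objects as well as plain tables, so watch out.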

Link: http://rpy.sourceforge.net/rpy2/doc-2.4/html/

Quick start:

>>> import rpy2.robjects as robjects
>>> robjects.r['load'](".RData")

The objects are now loaded into the R workspace.

>>> robjects.r['y']
<FloatVector - Python:0x24c6560 / R:0xf1f0e0>
[0.763684, 0.086314, 0.617097, ..., 0.443631, 0.281865, 0.839317]

That is a simple numeric vector. d is a data frame, which I can subset to get its columns:

>>> robjects.r['d'][0]
<IntVector - Python:0x24c9248 / R:0xbbc6c0>
[       1,        2,        3, ...,        8,        9,       10]
>>> robjects.r['d'][1]
<FloatVector - Python:0x24c93b0 / R:0xf1f230>
[0.975648, 0.597036, 0.254840, ..., 0.891975, 0.824879, 0.870136]
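Since the title also mentions converting to csv, here is a minimal sketch of how the loaded data frame might be written out. The function names columns_to_csv and rdata_frame_to_csv are hypothetical, and the sketch assumes that R and rpy2 are installed, that the named object is a data frame, and that its columns are plain numeric or character vectors (factor columns may come through as integer codes).

```python
import csv


def columns_to_csv(columns, csv_path):
    """Write a mapping of column name -> sequence of values to a CSV file."""
    names = list(columns)
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(names)
        # zip(*...) turns the column-wise data into rows
        writer.writerows(zip(*(columns[name] for name in names)))


def rdata_frame_to_csv(rdata_path, frame_name, csv_path):
    """Load an .RData file through R and dump one data frame to CSV.

    Hypothetical helper: assumes frame_name refers to a data frame stored
    in the file.
    """
    # Imported here so the pure-Python helper above works without rpy2
    import rpy2.robjects as robjects

    robjects.r['load'](rdata_path)   # same call as in the answer above
    frame = robjects.r[frame_name]
    # Index columns by position, as in robjects.r['d'][0] above
    columns = {
        name: list(frame[i]) for i, name in enumerate(frame.names)
    }
    columns_to_csv(columns, csv_path)
```

A call such as rdata_frame_to_csv(".RData", "d", "d.csv") would then produce one CSV per data frame; objects that are not tabular need their own handling.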

node.js – Can't find Python executable "/path/to/executable/python2.7", you can set the PYTHON env variable

> bufferutil@1.2.1 install /home/sudthenerd/polymer-starter-kit-1.2.1/node_modules/bufferutil
> node-gyp rebuild

gyp ERR! configure error
gyp ERR! stack Error: Can't find Python executable "/path/to/executable/python2.7", you can set the PYTHON env variable.
gyp ERR! stack     at failNoPython (/usr/lib/node_modules/npm/node_modules/node-gyp/lib/configure.js:401:14)
gyp ERR! stack     at /usr/lib/node_modules/npm/node_modules/node-gyp/lib/configure.js:330:11
gyp ERR! stack     at F (/usr/lib/node_modules/npm/node_modules/which/which.js:78:16)
gyp ERR! stack     at E (/usr/lib/node_modules/npm/node_modules/which/which.js:82:29)
gyp ERR! stack     at /usr/lib/node_modules/npm/node_modules/which/which.js:93:16
gyp ERR! stack     at FSReqWrap.oncomplete (fs.js:82:15)
gyp ERR! System Linux 3.13.0-74-generic
gyp ERR! command "/usr/bin/nodejs" "/usr/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js" "rebuild"
gyp ERR! cwd /home/sudthenerd/polymer-starter-kit-1.2.1/node_modules/bufferutil
gyp ERR! node -v v5.3.0
gyp ERR! node-gyp -v v3.2.1
gyp ERR! not ok
npm WARN install:bufferutil@1.2.1 bufferutil@1.2.1 install: node-gyp rebuild
npm WARN install:bufferutil@1.2.1 Exit status 1

> utf-8-validate@1.2.1 install /home/sudthenerd/polymer-starter-kit-1.2.1/node_modules/utf-8-validate
> node-gyp rebuild

gyp ERR! configure error
gyp ERR! stack Error: Can't find Python executable "/path/to/executable/python2.7", you can set the PYTHON env variable.
gyp ERR! stack     at failNoPython (/usr/lib/node_modules/npm/node_modules/node-gyp/lib/configure.js:401:14)
gyp ERR! stack     at /usr/lib/node_modules/npm/node_modules/node-gyp/lib/configure.js:330:11
gyp ERR! stack     at F (/usr/lib/node_modules/npm/node_modules/which/which.js:78:16)
gyp ERR! stack     at E (/usr/lib/node_modules/npm/node_modules/which/which.js:82:29)
gyp ERR! stack     at /usr/lib/node_modules/npm/node_modules/which/which.js:93:16
gyp ERR! stack     at FSReqWrap.oncomplete (fs.js:82:15)
gyp ERR! System Linux 3.13.0-74-generic
gyp ERR! command "/usr/bin/nodejs" "/usr/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js" "rebuild"
gyp ERR! cwd /home/sudthenerd/polymer-starter-kit-1.2.1/node_modules/utf-8-validate
gyp ERR! node -v v5.3.0
gyp ERR! node-gyp -v v3.2.1
gyp ERR! not ok
npm WARN install:utf-8-validate@1.2.1 utf-8-validate@1.2.1 install: node-gyp rebuild
npm WARN install:utf-8-validate@1.2.1 Exit status 1

Solution

The solution posted by another answerer (斯科特弗里斯) did not work for me, but this did:

npm config set python $(which python)
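The error message itself suggests an alternative: setting the PYTHON environment variable. The sketch below shows one way to locate an interpreter and build such an environment from Python. The helper names find_python and env_with_python are hypothetical, and the candidate list assumes that this node-gyp version (v3.2.1) needs a Python 2 interpreter.

```python
import os
import shutil


def find_python(candidates=("python2.7", "python2", "python")):
    """Return the full path of the first candidate found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def env_with_python(python_path, base_env=None):
    """Return a copy of the environment with PYTHON pointing at python_path."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYTHON"] = python_path
    return env
```

The resulting dictionary could be passed as env= to subprocess.run(["npm", "install"], ...) so that node-gyp picks up the interpreter without changing the npm configuration.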

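python – Flask: loading a data file into memory with a global variable

I have a large XML file that is opened, loaded into memory, and then closed by a Python class. A simplified example looks like this: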

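class Dictionary:
   def __init__(self, filename):
      # Read the whole file into memory once, when the object is created
      with open(filename) as f:
         self.contents = f.readlines()

   def getDefinitionForWord(self, word):
      # returns a word, using etree parser (body omitted in the original)
      pass

In my Flask application:

from flask import Flask
from dictionary import Dictionary

app = Flask(__name__)

# Module-level object: created when this module is imported
dictionary = Dictionary('dictionary.xml')
print('dictionary object created')

@app.route('/')
def home():
   word = dictionary.getDefinitionForWord('help')
   return word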
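I understand that in an ideal world I would use a database instead of XML and open a new connection to that database on every request.

From the documentation I understood that the application context in Flask means every request would re-create dictionary = Dictionary('dictionary.xml'), opening the file on disk and re-reading the whole thing into memory. However, when I look at the debug output, I see "dictionary object created" printed only once, despite connections from multiple sources (different sessions?).

My first question is:

Since my application seems to load the XML file only once, can I assume that it resides globally in memory and can safely be read by a large number of simultaneous requests, limited only by the RAM on my server? If the XML is 50 MB, it would take roughly 50 MB of memory and serve simultaneous requests at high speed. I am guessing it is not that easy.

My second question is:

If that is not the case, what limits my ability to handle a large amount of traffic? How many requests can I handle if I repeatedly open a 50 MB XML file, read it from disk, and close it? I assume one at a time.

I realize this is vague and hardware-dependent, but I am new to Flask, Python, and web programming, and am just looking for guidance.

Thanks!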

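Solution

As long as you do not modify the global object, it is safe to keep it this way. This is a WSGI feature, as explained in the Werkzeug docs (Werkzeug is the library Flask is built on).

The data will be held in the memory of each worker process of the WSGI application server. That does not mean it is stored only once, but the number of processes (workers) is small and constant (it does not depend on the number of sessions or on traffic).

So it is possible to keep it this way.

That said, in your place I would use a proper database. If you have 16 workers, your data will take at least 800 MB of RAM (the number of workers is usually twice the number of processors). And if the XML grows and you eventually decide to use a database service, you will need to rewrite the code.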
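The 800 MB figure follows from multiplying the file size by the number of worker processes. A small sketch of that arithmetic, using hypothetical helper names and taking the "twice the number of processors" rule from the answer as an assumption rather than a fixed property of any server:

```python
def default_workers(cpu_count):
    """Worker count under the answer's rule of thumb: two per processor."""
    return 2 * cpu_count


def estimated_resident_mb(data_mb, workers):
    """Lower bound on RAM used when every worker holds its own copy."""
    return data_mb * workers


# 8 processors -> 16 workers, each holding the 50 MB file
print(estimated_resident_mb(50, default_workers(8)))
```

This is a lower bound: parsed Python objects usually take more space than the raw file on disk.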

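If the reason for keeping the data in memory is that PostgreSQL and MySQL are too slow, you can use an SQLite database stored on an in-memory filesystem such as tmpfs or ramfs. That gives you speed and an SQL interface, and will probably reduce RAM usage. Migrating to PostgreSQL or MySQL later would also be easier (in terms of code).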
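A minimal sketch of the SQLite approach, using only the standard-library sqlite3 module. The table name definitions, its two columns, and both function names are assumptions made for illustration; the database path would point at a file on a tmpfs mount (for example under /dev/shm on many Linux systems), but any writable path works the same way.

```python
import sqlite3


def build_dictionary_db(db_path, entries):
    """Create the lookup table and fill it from (word, definition) pairs."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS definitions ("
                "word TEXT PRIMARY KEY, definition TEXT NOT NULL)"
            )
            conn.executemany(
                "INSERT OR REPLACE INTO definitions VALUES (?, ?)", entries
            )
    finally:
        conn.close()


def lookup(db_path, word):
    """Return the definition for word, or None if it is not present."""
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT definition FROM definitions WHERE word = ?", (word,)
        ).fetchone()
        return row[0] if row else None
    finally:
        conn.close()
```

With this layout every worker process reads the same file instead of holding its own parsed copy, and a request handler only needs to call lookup.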

python – How do I load multiple gpx files into PostGIS?

I have a bunch of gpx files from the GPSLogger for Android app.

The files look like this:

<?xml version="1.0" encoding="UTF-8"?>
<gpx  version="1.0" creator="GPSLogger - http://gpslogger.mendhak.com/"  
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"  
  xmlns="http://www.topografix.com/GPX/1/0" 
  xsi:schemaLocation="http://www.topografix.com/GPX/1/0 
    http://www.topografix.com/GPX/1/0/gpx.xsd" >
  <time>2011-08-26T06:25:20Z</time>
  <bounds></bounds>
  <trk>
    <trkseg>
      <trkpt  lat="46.94681501102746"  lon="7.398453755309032" >
        <ele>634.0</ele>
        <speed>0.0</speed>
        <src>gps</src>
        <sat>6</sat>
        <time>2011-08-26T06:25:20Z</time>
      </trkpt>
      <trkpt  lat="46.94758878281887"  lon="7.398622951942811" >
        <ele>748.0</ele>
        <speed>0.0</speed>
        <src>gps</src>
        <sat>5</sat>
        <time>2011-08-26T06:30:56Z</time>
      </trkpt>

...   ...   ...

    </trkseg>
  </trk>
</gpx>

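Is it possible to iterate over a directory containing these files and load them into a single PostGIS table using SQL or Python?

I came across this blog post, which says:

I'm not aware of anything that can convert straight from GPX to
PostGIS

This post gives an example of doing it with SQL, but I cannot make sense of the code :/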

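Solution

ogr2ogr (part of GDAL) is a simple, straightforward Unix shell tool for loading GPX files into PostGIS.

ogr2ogr -append -f PostgreSQL PG:dbname=walks walk.gpx

ogr2ogr creates its own database tables in PostGIS, with its own schema. The table tracks has one row per GPS track; tracks.wkb_geometry contains the GPS track itself as a MultiLineString. The table track_points contains the individual position fixes (with timestamps).

Here is what the database walks looks like before the import: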

walks=# \d
               List of relations
 Schema |       Name        | Type  |  Owner   
--------+-------------------+-------+----------
 public | geography_columns | view  | postgres
 public | geometry_columns  | view  | postgres
 public | raster_columns    | view  | postgres
 public | raster_overviews  | view  | postgres
 public | spatial_ref_sys   | table | postgres
(5 rows)

...and after the import:

walks=# \d
                    List of relations
 Schema |           Name           |   Type   |  Owner   
--------+--------------------------+----------+----------
 public | geography_columns        | view     | postgres
 public | geometry_columns         | view     | postgres
 public | raster_columns           | view     | postgres
 public | raster_overviews         | view     | postgres
 public | route_points             | table    | postgres
 public | route_points_ogc_fid_seq | sequence | postgres
 public | routes                   | table    | postgres
 public | routes_ogc_fid_seq       | sequence | postgres
 public | spatial_ref_sys          | table    | postgres
 public | track_points             | table    | postgres
 public | track_points_ogc_fid_seq | sequence | postgres
 public | tracks                   | table    | postgres
 public | tracks_ogc_fid_seq       | sequence | postgres
 public | waypoints                | table    | postgres
 public | waypoints_ogc_fid_seq    | sequence | postgres
(15 rows)
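To cover the "iterate over a directory" part of the question, the ogr2ogr command above can be run once per file from Python. This is a sketch under assumptions: the function names are hypothetical, ogr2ogr is taken to be on PATH, and the database is called walks as in the answer.

```python
import subprocess
from pathlib import Path


def ogr2ogr_command(gpx_path, dbname="walks"):
    """Build the same ogr2ogr invocation as shown above for one GPX file."""
    return [
        "ogr2ogr", "-append", "-f", "PostgreSQL",
        f"PG:dbname={dbname}", str(gpx_path),
    ]


def load_directory(directory, dbname="walks", run=subprocess.run):
    """Append every .gpx file in directory to the database, in name order.

    run is injectable so the loop can be exercised without GDAL installed.
    """
    loaded = []
    for gpx_path in sorted(Path(directory).glob("*.gpx")):
        # check=True raises CalledProcessError if ogr2ogr fails on a file
        run(ogr2ogr_command(gpx_path, dbname), check=True)
        loaded.append(gpx_path)
    return loaded
```

Because every call uses -append, all files end up in the same tracks and track_points tables.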

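Python - Redirecting stdout to a file in Python?

How do I redirect stdout to an arbitrary file in Python?

When a long-running Python script (for example, a web application) is started from within an ssh session and backgrounded, and the ssh session is then closed, the application raises IOError and fails the moment it tries to write to stdout. I need a way to make the application and its modules write to a file rather than stdout, to prevent failures caused by the IOError. Currently I use nohup to redirect output to a file, and that gets the job done, but out of curiosity I would like to know whether there is a way to do it without nohup.

I have already tried sys.stdout = open('somefile', 'w'), but this does not seem to stop some external modules from still writing to the terminal (or perhaps the sys.stdout = ... line never ran at all). I know it should work, based on simpler scripts I have tested, but I have not yet had time to test it on the web application.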

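Answer 1

If you want to do the redirection from within the Python script, setting sys.stdout to a file object does the trick:

import sys
sys.stdout = open('file', 'w')
print('test')

A far more common approach is to use shell redirection when running the script (this works the same on Windows and Linux):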

$ python foo.py > file
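As a scoped variant of the sys.stdout assignment, the standard library's contextlib.redirect_stdout (Python 3.4 and later) restores the original stream automatically. The wrapper name run_with_stdout_to_file is hypothetical. Like the assignment above, this only affects output written through Python's sys.stdout; code that writes directly to file descriptor 1, such as C extensions or child processes, is not redirected, which may be what the question observed with external modules.

```python
import contextlib


def run_with_stdout_to_file(func, log_path):
    """Call func() with sys.stdout temporarily pointed at log_path.

    Output is appended to the file, and sys.stdout is restored afterwards
    even if func raises.
    """
    with open(log_path, "a") as log_file:
        with contextlib.redirect_stdout(log_file):
            return func()
```

For example, run_with_stdout_to_file(main, "app.log") would send everything main prints to app.log, where main stands for whatever entry point the application has.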
