Python file exercise: read a file and compute the average score (Python: read a file and compute the mean)

25-02-10 18

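This article walks through a Python file exercise: reading a file and computing the average score. It also collects four related questions: how to read a file in Django / Python and confirm it is an audio file, how to walk a directory in PySpark and count lines, how to store the last 3 scores, drop older ones, and compute the average, and the best way to read a file and split its lines on semicolons.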

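Contents:

Python file exercise: read a file and compute the average score (Python: read a file and compute the mean)
Django / Python: how to read a file and confirm it is an audio file?
PySpark: how to walk a directory, list its files, and count lines
Python: store the last 3 scores, delete older scores, and compute the average?
Python: the best way to read a file and split its lines on semicolons

Python file exercise: read a file and compute the average score (Python: read a file and compute the mean)

Read a file and compute the average score

The file looks like this (each line is a name and a score separated by a full-width comma; one student has no score):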

小白，88
小黑，90.5
小黄，
小花，33

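First working version

score = []
total = 0
count = 0
with open('成绩', encoding='utf-8') as f:  # the with block closes the file automatically
    for line in f:  # iterate line by line instead of loading the whole file into memory
        score.append(line.split('，')[1].strip())  # take the score field and store it in the list
    for i in score:
        # Add up the scores. Note that int() raises ValueError for '' and for '90.5',
        # so this version only works when every score is a whole number.
        total += int(i)
        count += 1
    average = total / count
    print('Average score: %s' % average)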

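Second pass: improvements

(1) Accumulate the total while reading the scores. There is no need to build a list and loop over it afterwards, which saves one loop.

(2) Add emptiness checks. The split result may contain only a name (a line with no comma), and the score field may be empty; converting an empty string to int or float raises an error.

(3) Scores are not necessarily integers, so convert them with float.

(4) Print the average with two decimal places.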

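total = 0.0  # renamed from sum to avoid shadowing the built-in sum()
count = 0
with open('成绩', encoding='utf-8') as f:
    for line in f:
        parts = line.split('，')  # split once on the full-width comma
        # Guard against a line with no comma and against an empty score field;
        # float('') would raise ValueError.
        if len(parts) > 1 and parts[1].strip():
            total += float(parts[1].strip())  # float instead of int, so decimal scores work
        # A student without a score still counts toward the head count.
        count += 1
average = total / count if count else 0.0  # avoid dividing by zero on an empty file
print('Total: {}, students: {}, average: {:.2f}'.format(total, count, average))  # two decimal places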
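
The same logic can be wrapped in a function so it is easier to reuse and test. This is only a sketch: the name `average_score` and its `sep` parameter are illustrative and not part of the original exercise, and skipping blank lines is an added assumption. As in the second version above, a student with no score still counts toward the head count.

```python
def average_score(path, sep="，"):
    """Return (total, count, average) for a file of 'name<sep>score' lines.

    Blank lines are skipped. A student with an empty score is counted
    as a person but contributes nothing to the total.
    """
    total = 0.0
    count = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue  # ignore blank lines entirely
            # partition() never raises, even when the separator is missing
            _name, _, raw = line.partition(sep)
            raw = raw.strip()
            if raw:
                total += float(raw)
            count += 1
    average = total / count if count else 0.0
    return total, count, average
```

Whether students without a score should be left out of the denominator is a policy decision; to exclude them, move `count += 1` inside the `if raw:` branch.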

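Django / Python: how to read a file and confirm it is an audio file?

I am building a web application where users can upload media content, including audio files.

My AudioFileUploadForm has a clean method that validates the following:

The audio file is not too large.
The audio file has a valid content_type (MIME type).
The audio file has a valid extension.

However, I am worried about security. A user could upload a file containing malicious code and easily pass the checks above. As a next step, I want to verify that the audio file really is an audio file (before writing it to disk).

How should I do this?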

import os

from django import forms
from django.core.exceptions import ValidationError


class UploadAudioForm(forms.ModelForm):
    audio_file = forms.FileField()

    def clean_audio_file(self):
        file = self.cleaned_data.get('audio_file', False)
        if file:
            if file.size > 12 * 1024 * 1024:  # size in bytes
                raise ValidationError("Audio file too large ( > 12mb )")
            if file.content_type not in ['audio/mpeg', 'audio/mp4', 'audio/basic', 'audio/x-midi', 'audio/vorbis', 'audio/x-pn-realaudio', 'audio/vnd.rn-realaudio', 'audio/wav', 'audio/x-wav']:
                raise ValidationError("Sorry, we do not support that audio MIME type. Please try uploading an mp3 file, or other common audio type.")
            if os.path.splitext(file.name)[1] not in ['.mp3', '.au', '.midi', '.ogg', '.ra', '.ram', '.wav']:
                raise ValidationError("Sorry, your audio file doesn't have a proper extension.")
            # Next, I want to read the file and make sure it is
            # a valid audio file. How should I do this? Use a library?
            # Read a portion of the file? ...?
            if not ???.is_audio(file.content):
                raise ValidationError("Not a valid audio file.")
            return file
        else:
            raise ValidationError("Couldn't read uploaded file")

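Edit: by "verify that the audio file really is an audio file" I mean:

A file that contains the data typical of an audio file. I am worried that a user will upload a file with a proper header but with a malicious script in place of the audio data. For example...
Is the mp3 file really an mp3 file, or does it contain something uncharacteristic of one?

Answer 1

The other answers posted here parse the header. That means someone can still place other data after a valid header.

The alternative is to verify the whole file. This costs more CPU, but it is a stricter policy. A library that can do this is Python
audiotools, and the relevant API method is AudioFile.verify.

Use it like this: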

import audiotools

# filename is the path of the file to check (defined elsewhere)
f = audiotools.open(filename)
try:
    result = f.verify()
except audiotools.InvalidFile:
    # Invalid file.
    print("Invalid File")
else:
    # Valid file.
    print("Valid File")

One caveat: the verify method is very strict and can flag badly encoded files as invalid. You have to decide for yourself whether that suits your use case.
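
For comparison, the cheaper header check that the answer contrasts with full verification can be sketched with the standard library alone. The function name `sniff_audio_type` is hypothetical and only a few common signatures are covered. As the answer points out, a matching header says nothing about the bytes that follow, so this is a pre-filter at best, not proof that the file is valid audio.

```python
def sniff_audio_type(head: bytes):
    """Guess an audio container from the first bytes of a file, or return None."""
    if head[:4] == b"RIFF" and head[8:12] == b"WAVE":
        return "wav"   # RIFF container holding WAVE data
    if head[:4] == b"OggS":
        return "ogg"   # Ogg page capture pattern
    if head[:4] == b"fLaC":
        return "flac"  # FLAC stream marker
    if head[:3] == b"ID3":
        return "mp3"   # MP3 that starts with an ID3v2 tag
    if len(head) >= 2 and head[0] == 0xFF and (head[1] & 0xE0) == 0xE0:
        return "mp3"   # bare MPEG audio frame sync (11 set bits)
    return None
```

Inside a Django clean method, one way to use it would be to read the first 12 bytes of the uploaded file, pass them to the function, and then seek back to position 0 so later processing sees the whole file.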

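PySpark: how to walk a directory, list its files, and count lines

I am trying to walk an hdfs directory and its subdirectories to find csv files and count the number of lines in each file. I am trying the snippet below, but it keeps throwing the error "IllegalArgumentException: 'Pathname /hdfs:/data/msd from /hdfs:/data/msd is not a valid DFS filename.'"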

hadoop = sc._jvm.org.apache.hadoop
fs = hadoop.fs.FileSystem
conf = hadoop.conf.Configuration() 
path = hadoop.fs.Path("/hdfs:///data/msd")

for f in fs.get(conf).listStatus(path):
    print(f.getPath(),f.getLen())

Solution

Just remove the leading slash from your path. It should be hdfs:///data/msd
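
To see why the leading slash matters, it can help to look at how a generic URI parser splits the two strings. The sketch below uses Python's `urllib.parse`, which is not the parser Hadoop uses, so treat it purely as an analogy; the helper name `describe` is made up for illustration.

```python
from urllib.parse import urlsplit


def describe(path):
    """Return (scheme, path) as a generic URI parser sees the string."""
    parts = urlsplit(path)
    return parts.scheme, parts.path


# With the leading slash there is no scheme at all: the whole string,
# "hdfs:" included, is treated as a plain path.
print(describe("/hdfs:///data/msd"))

# Without it, "hdfs" is recognized as the scheme and the path is clean.
print(describe("hdfs:///data/msd"))
```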

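Python: store the last 3 scores, delete older scores, and compute the average?

I am writing a program that opens and reads a csv file and sorts it in the following ways:

> Alphabetically, with each student's highest score.
> By highest score, from highest to lowest.
> By average score, from highest to lowest.

The program should store the last 3 scores for each student. This is the part where I am stuck and need help. When sorting the file alphabetically, the program needs to look at each student's 3 most recent scores and pick the highest one. At the moment my code only sorts the file alphabetically; it does not yet look at the 3 most recent scores and pick the highest.

My code already sorts the scores from highest to lowest, but it prints every score each student has, instead of printing the highest of their 3 most recent scores.

Andrew 1
Andrew 2
Andrew 3
Andrew 4
Andrew 5

Finally, I need help computing the average score for each student. My guess is that it should add Andrew's last 3 scores, which are 5, 4 and 3, and divide the sum by 3.

Here is my code: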

import csv,operator

selected_class = input("Pick a class file,(5,6 or 7)? ")

print("1. Alphabetical order.")
print("2. Highest to lowest.")
print("3. Average score.")

selected_sorting = input("Pick an option 1,2,or 3: ")

class_file = "Class " + selected_class + ".csv"
open_file = open(class_file)
csv_file = csv.reader(open_file)

if selected_sorting == "1":
    sorted_name = sorted(csv_file,key=operator.itemgetter(0))
    for i in sorted_name:
        print(i)

elif selected_sorting == "2":
    sorted_results = sorted(csv_file,key=lambda row: int(row[1]),reverse=True)
    for i in sorted_results:
        print(i)

elif selected_sorting == "3":
    pass  # average score: not implemented yet (this is the part being asked about)

Solution

Here is some demo code:

# -*- coding: utf-8 -*-
import csv
from collections import defaultdict
from statistics import mean

class_file = 'scores.csv'


def main():
    # First, group all scores by student name. This produces a
    # structure like:
    # {
    #     'Andrew': [1, 2, 3, 4, 5],
    #     'Luck': [10, 20],
    # }
    score_groups = defaultdict(list)
    with open(class_file, newline='') as open_file:
        for name, score in csv.reader(open_file):
            score_groups[name].append(int(score))

    # Second, keep only the 3 latest scores per student.
    l3_score_groups = [(key, value[-3:]) for key, value in score_groups.items()]

    print("1. Alphabetical order with each student's highest score.")
    l3_highest_score_groups = [(key, max(values)) for key, values in l3_score_groups]
    for name, score in sorted(l3_highest_score_groups, key=lambda x: x[0]):
        print(name, score)

    print('2. By the highest score, highest to lowest.')
    for name, score in sorted(l3_highest_score_groups, key=lambda x: x[1], reverse=True):
        print(name, score)

    print('3. Average score, highest to lowest.')
    l3_aver_score_groups = [(key, mean(values)) for key, values in l3_score_groups]
    for name, score in sorted(l3_aver_score_groups, key=lambda x: x[1], reverse=True):
        print(name, score)


if __name__ == '__main__':
    main()

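The techniques used above are:

> collections.defaultdict: a data structure that is handy for grouping data.
> list comprehensions: a powerful tool for transforming or filtering iterable data.
> statistics.mean: computes the mean of a list.

Hope this helps.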
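
The demo above keeps every score and slices the last three afterwards. If the older scores really should be discarded as they arrive, one alternative is `collections.deque` with `maxlen`. This is a sketch of a swapped-in technique, not part of the original answer; the function names `last_scores` and `summarize` are hypothetical.

```python
from collections import defaultdict, deque
from statistics import mean


def last_scores(rows, keep=3):
    """Group (name, score) rows, retaining only the newest `keep` scores."""
    recent = defaultdict(lambda: deque(maxlen=keep))
    for name, score in rows:
        # When the deque is full, appending silently drops the oldest score.
        recent[name].append(int(score))
    return {name: list(scores) for name, scores in recent.items()}


def summarize(rows, keep=3):
    """Return {name: (highest, average)} over each student's newest scores."""
    return {
        name: (max(scores), mean(scores))
        for name, scores in last_scores(rows, keep).items()
    }
```

Rows can come straight from `csv.reader`, since each row is already a sequence of name and score strings.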

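Python: the best way to read a file and split its lines on semicolons

What is the best way to read a file and split its lines on semicolons? The returned data should be a list of tuples.

Can this approach be beaten? Can it be done faster, or with less memory?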

def readfile(filepath,delim):
    with open(filepath,'r') as f:
        return [tuple(line.split(delim)) for line in f]
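
On the memory side, one option is to yield rows lazily instead of building the whole list. The sketch below uses the standard `csv` module for the splitting, so the result differs from the function above in two ways: the trailing newline is not left in the last field, and quoted fields are handled. The name `iter_rows` and the UTF-8 encoding are assumptions, and no speed claim is made here; measure with your own data.

```python
import csv


def iter_rows(filepath, delim=";"):
    """Yield one tuple per line without holding the whole file in memory."""
    with open(filepath, newline="", encoding="utf-8") as f:
        for row in csv.reader(f, delimiter=delim):
            yield tuple(row)
```

Calling `list(iter_rows(path))` gives the list of tuples the question asks for, while a plain `for` loop over `iter_rows(path)` processes one row at a time.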
