
Scrapy Documentation Series: Downloading and Processing Images (scrapy image download)



In this article you will learn about the Scrapy documentation on downloading and processing images, together with notes on downloading images with Scrapy. The article also collects several related pieces: 4. Web crawling with the scrapy module, downloading images with tag selectors and regex-matching tags; downloading and processing images with angular.js + node.js; a translation of the HAProxy documentation, Chapter 3, Global parameters (1), with the English original attached; and practical tips on iOS image upload, compression, and processing.

In this article:

Scrapy Documentation Series: Downloading and Processing Images (scrapy image download)


Scrapy provides an item pipeline for downloading images attached to a particular item, for example, when you scrape products and also want to download their images locally.

This pipeline, called the Images Pipeline and implemented in the ImagesPipeline class, provides a convenient way for downloading and storing images locally with some additional features:

  • Convert all downloaded images to a common format (JPG) and mode (RGB)

  • Avoid re-downloading images which were downloaded recently

  • Thumbnail generation

  • Check images width/height to make sure they meet a minimum constraint

This pipeline also keeps an internal queue of those images which are currently being scheduled for download, and connects those items that arrive containing the same image, to that queue. This avoids downloading the same image more than once when it’s shared by several items.

The Python Imaging Library is used for thumbnailing and normalizing images to JPEG/RGB format, so you need to install that library in order to use the images pipeline.

Using the Images Pipeline

The typical workflow, when using the ImagesPipeline, goes like this:

  1. In a Spider, you scrape an item and put the URLs of its images into an image_urls field (a minimal spider sketch follows this list).

  2. The item is returned from the spider and goes to the item pipeline.

  3. When the item reaches the ImagesPipeline, the URLs in the image_urls field are scheduled for download using the standard Scrapy scheduler and downloader (which means the scheduler and downloader middlewares are reused), but with a higher priority, processing them before other pages are scraped. The item remains “locked” at that particular pipeline stage until the images have finished downloading (or fail for some reason).

  4. When the images are downloaded, another field (images) will be populated with the results. This field will contain a list of dicts with information about the images downloaded, such as the downloaded path, the original scraped url (taken from the image_urls field), and the image checksum. The images in the list of the images field will retain the same order as the original image_urls field. If some image fails to download, an error will be logged and the image won't be present in the images field.
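
For reference, here is a minimal spider sketch for step 1. The project module (myproject.items) and the start URL are placeholders, not something defined in this article; it simply fills the image_urls field of the MyItem item shown in the next section:

import scrapy
from myproject.items import MyItem   # hypothetical module holding the MyItem item below

class ProductSpider(scrapy.Spider):
    name = 'products'
    start_urls = ['http://www.example.com/products']   # placeholder start page

    def parse(self, response):
        item = MyItem()
        # Step 1: put the image URLs into the image_urls field.
        item['image_urls'] = response.xpath('//img/@src').extract()
        # Step 2: return the item so it enters the item pipeline.
        return item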

Usage example

In order to use the image pipeline you just need to enable it and define an item with the image_urls and images fields:

from scrapy.item import Item, Field

class MyItem(Item):

    # ... other item fields ...
    image_urls = Field()
    images = Field()

If you need something more complex and want to override the custom images pipeline behaviour, see Implementing your custom Images Pipeline.

Enabling your Images Pipeline

To enable your images pipeline you must first add it to your project ITEM_PIPELINES setting:

ITEM_PIPELINES = ['scrapy.contrib.pipeline.images.ImagesPipeline']

And set the IMAGES_STORE setting to a valid directory that will be used for storing the downloaded images. Otherwise the pipeline will remain disabled, even if you include it in the ITEM_PIPELINES setting.

For example:

IMAGES_STORE = '/path/to/valid/dir'

Images Storage

File system is currently the only officially supported storage, but there is also (undocumented) support for Amazon S3.

File system storage

The images are stored in files (one per image), using a SHA1 hash of their URLs for the file names.

For example, the following image URL:

http://www.example.com/image.jpg

Whose SHA1 hash is:

3afec3b4765f8f0a07b78f98c07b83f013567a0a

Will be downloaded and stored in the following file:

<IMAGES_STORE>/full/3afec3b4765f8f0a07b78f98c07b83f013567a0a.jpg

Where:

  • <IMAGES_STORE> is the directory defined in IMAGES_STORE setting

  • full is a sub-directory to separate full images from thumbnails (if used). For more info see Thumbnail generation.
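
To illustrate this naming scheme, here is a short sketch (not part of the original document) that reproduces the path shown above:

import hashlib
import os

IMAGES_STORE = '/path/to/valid/dir'
url = 'http://www.example.com/image.jpg'

# The file name is the SHA1 hex digest of the image URL, stored under the "full" sub-directory.
image_id = hashlib.sha1(url.encode('utf-8')).hexdigest()
print(os.path.join(IMAGES_STORE, 'full', image_id + '.jpg'))
# -> /path/to/valid/dir/full/3afec3b4765f8f0a07b78f98c07b83f013567a0a.jpg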

Additional features

Image expiration

The Image Pipeline avoids downloading images that were downloaded recently. To adjust this retention delay use the IMAGES_EXPIRES setting, which specifies the delay in number of days:

# 90 days of delay for image expiration
IMAGES_EXPIRES = 90

Thumbnail generation

The Images Pipeline can automatically create thumbnails of the downloaded images.

In order to use this feature, you must set IMAGES_THUMBS to a dictionary where the keys are the thumbnail names and the values are their dimensions.

For example:

IMAGES_THUMBS = {
    'small': (50, 50),
    'big': (270, 270),
}

When you use this feature, the Images Pipeline will create thumbnails of each specified size with this format:

<IMAGES_STORE>/thumbs/<size_name>/<image_id>.jpg

Where:

  • <size_name> is the one specified in the IMAGES_THUMBS dictionary keys (small, big, etc)

  • <image_id> is the SHA1 hash of the image url

Example of image files stored using small and big thumbnail names:

<IMAGES_STORE>/full/63bbfea82b8880ed33cdb762aa11fab722a90a24.jpg
<IMAGES_STORE>/thumbs/small/63bbfea82b8880ed33cdb762aa11fab722a90a24.jpg
<IMAGES_STORE>/thumbs/big/63bbfea82b8880ed33cdb762aa11fab722a90a24.jpg

The first one is the full image, as downloaded from the site.

Filtering out small images

You can drop images which are too small, by specifying the minimum allowed size in the IMAGES_MIN_HEIGHT and IMAGES_MIN_WIDTH settings.

For example:

IMAGES_MIN_HEIGHT = 110
IMAGES_MIN_WIDTH = 110

Note: these size constraints don’t affect thumbnail generation at all.

By default, there are no size constraints, so all images are processed.
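
Putting the settings from this section together, a sample settings.py fragment might look like the following; the values are the illustrative ones used in the examples above:

# settings.py (illustrative values taken from the examples in this article)
ITEM_PIPELINES = ['scrapy.contrib.pipeline.images.ImagesPipeline']

IMAGES_STORE = '/path/to/valid/dir'   # where downloaded images are written
IMAGES_EXPIRES = 90                   # skip re-downloading images fetched within the last 90 days
IMAGES_THUMBS = {
    'small': (50, 50),
    'big': (270, 270),
}
IMAGES_MIN_HEIGHT = 110               # drop images smaller than 110x110 pixels
IMAGES_MIN_WIDTH = 110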

Implementing your custom Images Pipeline

Here are the methods that you should override in your custom Images Pipeline:

  • class scrapy.contrib.pipeline.images.ImagesPipeline


    • get_media_requests(item, info)

    • As seen in the workflow, the pipeline will get the URLs of the images to download from the item. In order to do this, you must override the get_media_requests() method and return a Request for each image URL:

      def get_media_requests(self, item, info):
          for image_url in item['image_urls']:
              yield Request(image_url)

      Those requests will be processed by the pipeline and, when they have finished downloading, the results will be sent to the item_completed() method, as a list of 2-element tuples. Each tuple will contain (success, image_info_or_failure) where:

    • success is a boolean which is True if the image was downloaded successfully or False if it failed for some reason

    • image_info_or_error is a dict containing the following keys (if success is True) or a Twisted Failure if there was a problem.

    • url - the url where the image was downloaded from. This is the url of the request returned from the get_media_requests() method.

    • path - the path (relative to IMAGES_STORE) where the image was stored

    • checksum - an MD5 hash of the image contents

      The list of tuples received by item_completed() is guaranteed to retain the same order as the requests returned from the get_media_requests() method.

      Here's a typical value of the results argument:

      [(True,
        {'checksum': '2b00042f7481c7b056c4b410d28f33cf',
         'path': 'full/7d97e98f8af710c7e7fe703abc8f639e0ee507c4.jpg',
         'url': 'http://www.example.com/images/product1.jpg'}),
       (True,
        {'checksum': 'b9628c4ab9b595f72f280b90c4fd093d',
         'path': 'full/1ca5879492b8fd606df1964ea3c1e2f4520f076f.jpg',
         'url': 'http://www.example.com/images/product2.jpg'}),
       (False,
        Failure(...))]

      By default the get_media_requests() method returns None, which means there are no images to download for the item.

    • item_completed(results, item, info)

    • The ImagesPipeline.item_completed() method is called when all image requests for a single item have completed (either finished downloading, or failed for some reason). The item_completed() method must return the output that will be sent to subsequent item pipeline stages, so you must return (or drop) the item, as you would in any pipeline.

      Here is an example of the item_completed() method where we store the downloaded image paths (passed in results) in the image_paths item field, and we drop the item if it doesn't contain any images:

      from scrapy.exceptions import DropItem
      
      def item_completed(self, results, item, info):
          image_paths = [x['path'] for ok, x in results if ok]
          if not image_paths:
              raise DropItem("Item contains no images")
          item['image_paths'] = image_paths
          return item

      By default, the item_completed() method returns the item.

Custom Images pipeline example

Here is a full example of the Images Pipeline whose methods are exemplified above:

from scrapy.contrib.pipeline.images import ImagesPipeline
from scrapy.exceptions import DropItem
from scrapy.http import Request

class MyImagesPipeline(ImagesPipeline):

    def get_media_requests(self, item, info):
        for image_url in item['image_urls']:
            yield Request(image_url)

    def item_completed(self, results, item, info):
        image_paths = [x['path'] for ok, x in results if ok]
        if not image_paths:
            raise DropItem("Item contains no images")
        item['image_paths'] = image_paths
        return item
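
To activate this subclass instead of the stock pipeline, point ITEM_PIPELINES at it. The module path below (myproject.pipelines) is a hypothetical location, not one given in the original text:

# settings.py -- enable the customized pipeline (hypothetical module path)
ITEM_PIPELINES = ['myproject.pipelines.MyImagesPipeline']
IMAGES_STORE = '/path/to/valid/dir'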


4. Web crawlers: downloading images with the scrapy module's tag selectors, and regex-matching tags



Tag selector objects

HtmlXPathSelector() creates a tag selector object; it takes the html response object passed to the callback.
It requires the import: from scrapy.selector import HtmlXPathSelector

select() is the tag selector method of HtmlXPathSelector; it takes a selector rule and returns a list whose elements are tag objects.

extract() returns the content filtered out by the selector, as a list whose elements are the contents.

Selector rules (a short sketch using these rules follows the list)

  //x       searches any number of levels down for the given tag, e.g. //div finds all div tags
  /x        searches one level down for the given tag
  /@x       selects the given attribute; can be chained, e.g. @id, @src
  [@x="y"]  selects tags whose attribute x equals the value y, e.g. tags whose class equals a given name; can be chained
  /text()   gets the text content of a tag
  [x]       picks a single element of the result set by index
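
As a quick illustration of these rules (a sketch against a made-up HTML fragment, not code from the original tutorial), using Scrapy's Selector class introduced later in this article:

from scrapy.selector import Selector

# A made-up HTML fragment, just to exercise the rules above.
html = '<div class="showlist"><li><img src="/a.jpg" alt="first"></li>' \
       '<li><img src="/b.jpg" alt="second"></li></div>'
sel = Selector(text=html)

print(sel.xpath('//div[@class="showlist"]/li').extract())               # //x plus [@x="y"]: the li tags under the matching div
print(sel.xpath('//div[@class="showlist"]/li[1]//img/@alt').extract())  # [x] index plus /@x: alt of the image in the first li
print(sel.xpath('//li//img/@src').extract())                            # /@x: the src attributes of all images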

Getting the specified tag objects

# -*- coding: utf-8 -*-
import scrapy                                   # import the spider module
from scrapy.selector import HtmlXPathSelector   # import the HtmlXPathSelector module
from urllib import request                      # import the request module
import os

class AdcSpider(scrapy.Spider):
    name = 'adc'                                        # set the spider name
    allowed_domains = ['www.shaimn.com']
    start_urls = ['http://www.shaimn.com/xinggan/']

    def parse(self, response):
        hxs = HtmlXPathSelector(response)               # create an HtmlXPathSelector object from the response

        items = hxs.select('//div[@class="showlist"]/li')  # tag selector: every li under divs whose class equals "showlist"
        print(items)                                        # prints the tag objects


Loop over every li tag to get its child tags and their attributes or text:


# -*- coding: utf-8 -*-
import scrapy                                   # import the spider module
from scrapy.selector import HtmlXPathSelector   # import the HtmlXPathSelector module
from urllib import request                      # import the request module
import os

class AdcSpider(scrapy.Spider):
    name = 'adc'                                        # set the spider name
    allowed_domains = ['www.shaimn.com']
    start_urls = ['http://www.shaimn.com/xinggan/']

    def parse(self, response):
        hxs = HtmlXPathSelector(response)               # create an HtmlXPathSelector object from the response

        items = hxs.select('//div[@class="showlist"]/li')  # tag selector: every li under divs whose class equals "showlist"
        # print(items)                                      # the tag objects
        for i in range(len(items)):                         # loop as many times as there are li tags
            # XPath indexes are 1-based, so use i + 1 to address the current li tag
            title = hxs.select('//div[@class="showlist"]/li[%d]//img/@alt' % (i + 1)).extract()  # alt attribute of the img inside the current li
            src = hxs.select('//div[@class="showlist"]/li[%d]//img/@src' % (i + 1)).extract()    # src attribute of the img inside the current li
            if title and src:
                print(title, src)   # lists of strings


Saving the scraped images to the local disk

urlretrieve() saves a file locally; argument 1 is the src of the file to save, argument 2 is the save path.
urlretrieve is a method of the request module under urllib, so it needs: from urllib import request

# -*- coding: utf-8 -*-
import scrapy                                   # import the spider module
from scrapy.selector import HtmlXPathSelector   # import the HtmlXPathSelector module
from urllib import request                      # import the request module
import os

class AdcSpider(scrapy.Spider):
    name = 'adc'                                        # set the spider name
    allowed_domains = ['www.shaimn.com']
    start_urls = ['http://www.shaimn.com/xinggan/']

    def parse(self, response):
        hxs = HtmlXPathSelector(response)               # create an HtmlXPathSelector object from the response

        items = hxs.select('//div[@class="showlist"]/li')  # tag selector: every li under divs whose class equals "showlist"
        # print(items)                                      # the tag objects
        for i in range(len(items)):                         # loop as many times as there are li tags
            # XPath indexes are 1-based, so use i + 1 to address the current li tag
            title = hxs.select('//div[@class="showlist"]/li[%d]//img/@alt' % (i + 1)).extract()  # alt attribute of the img inside the current li
            src = hxs.select('//div[@class="showlist"]/li[%d]//img/@src' % (i + 1)).extract()    # src attribute of the img inside the current li
            if title and src:
                # print(title[0], src[0])                                # index into the lists to get the strings
                img_dir = os.path.join(os.getcwd(), 'img')               # directory for the saved images
                os.makedirs(img_dir, exist_ok=True)                      # make sure it exists before saving
                file_path = os.path.join(img_dir, title[0] + '.jpg')     # build the save path for this image
                request.urlretrieve(src[0], file_path)                   # save the image: arg 1 is the src, arg 2 the save path


xpath() is the tag selector method of the Selector class; its argument is a selector rule [recommended]

The selector rules are the same as above.

Selector() creates the selector object; it needs to receive the html response object.
It requires the import: from scrapy.selector import Selector

# -*- coding: utf-8 -*-
import scrapy                                   # import the spider module
from scrapy.selector import HtmlXPathSelector   # import the HtmlXPathSelector module
from scrapy.selector import Selector

class AdcSpider(scrapy.Spider):
    name = 'adc'                                        # set the spider name
    allowed_domains = ['www.shaimn.com']
    start_urls = ['http://www.shaimn.com/xinggan/']

    def parse(self, response):
        items = Selector(response=response).xpath('//div[@class="showlist"]/li').extract()
        # print(items)                                     # the tag objects
        for i in range(len(items)):
            # XPath indexes are 1-based, so use i + 1 to address the current li tag
            title = Selector(response=response).xpath('//div[@class="showlist"]/li[%d]//img/@alt' % (i + 1)).extract()
            src = Selector(response=response).xpath('//div[@class="showlist"]/li[%d]//img/@src' % (i + 1)).extract()
            print(title, src)

Using regular expressions

Regular expressions complement the selectors; they are used when a selector rule alone cannot express the required filtering.

There are two ways to use them:

  1. Run a regular expression over the result already filtered out by the selector rule

  2. Apply a regular expression inside the selector rule itself

1. Run a regular expression over the result filtered out by the selector rule, and use the regex to take the final content

Append .re('regex') at the end

# -*- coding: utf-8 -*-
import scrapy                                   # import the spider module
from scrapy.selector import HtmlXPathSelector   # import the HtmlXPathSelector module
from scrapy.selector import Selector

class AdcSpider(scrapy.Spider):
    name = 'adc'                                        # set the spider name
    allowed_domains = ['www.shaimn.com']
    start_urls = ['http://www.shaimn.com/xinggan/']

    def parse(self, response):
        items = Selector(response=response).xpath('//div[@class="showlist"]/li//img')[0].extract()
        print(items)                                     # the tag as a string
        items2 = Selector(response=response).xpath('//div[@class="showlist"]/li//img')[0].re('alt="(\w+)')
        print(items2)

# <img src="http://www.shaimn.com/uploads/170724/1-1FH4221056141.jpg" alt="人体艺术mmSunny前凸后翘性感诱惑写真">
# ['人体艺术mmSunny前凸后翘性感诱惑写真']

2. Apply a regular expression inside the selector rule itself

[re:regex]

# -*- coding: utf-8 -*-
import scrapy                                   # import the spider module
from scrapy.selector import HtmlXPathSelector   # import the HtmlXPathSelector module
from scrapy.selector import Selector

class AdcSpider(scrapy.Spider):
    name = 'adc'                                        # set the spider name
    allowed_domains = ['www.shaimn.com']
    start_urls = ['http://www.shaimn.com/xinggan/']

    def parse(self, response):
        items = Selector(response=response).xpath('//div').extract()
        # print(items)                                     # the tag objects
        items2 = Selector(response=response).xpath('//div[re:test(@class, "showlist")]').extract()  # regex: divs whose class matches "showlist"
        print(items2)

[Reposted from: http://www.leiqiankun.com/?id=47]

Downloading and processing images with angular.js + node.js, explained


Preface

This article explains how to download and process images with angular.js + node.js. There are two approaches; the details are below.

Approach 1:

Do not send the full path. Send a GET request with just the file name, let the server assemble the path itself, and use Express's res.download to serve the download:

Express:

var filePath = path.join(savePath, file[0].name);
console.log('Download file: ' + filePath);
res.download(filePath);

angular:

$http.get(url).success(function (data) {
    var bin = new $window.Blob([data]);

    deferred.resolve(data);

    // Using file-saver library to handle saving work.
    saveAs(bin, toFilename);
});

This approach suits the case where only a file name travels between the server and the user: the browser builds a REST API URL such as /api/download/xxxxx.png, and the server finds the full path itself and sends the file to the user.

Approach 2:

Write no server-side handler at all; instead, send the file to the user through Express's static file mechanism.

Express:

app.use('/ocr/uploads', express.static('/data/ocr_img/dev', { maxAge: 86400000 }));

Angular:

$http.get(url, {responseType: 'arraybuffer'}).success(function (data) {

    var bin = new Blob([data], { "type" : "image/png" });

    deferred.resolve({status: '200'});

    saveAs(bin, toFilename);
});

This approach suits the case where the user knows the server has static file serving enabled, and therefore builds the full relative path. For example, if the current page's URL is /ocr, a URL of uploads/xxx.png will make Express map it to /data/ocr_img/dev/xxx.png and send the file back.

Note: when the image is sent back, the server defaults to text/plain, while an image needs binary data. So set {responseType: 'arraybuffer'}, and when the blob data arrives, specify its type with new Blob([data], { "type" : "image/png" }); the same applies to other file types such as pdf, jpg, bmp, and tiff.

Downloading an image is really just downloading a binary file; see the relevant MDN documentation for the underlying details.

Summary

That is all for this article. Hopefully the content helps with your study or work; if you have questions, feel free to leave a comment.

HAProxy documentation translation (Chapter 3): Global parameters (1), with the English original attached


3. Global parameters

Parameters in the "global" section are process-wide and often OS-specific. They are generally set once for all and do not need to be changed once correct. Some of them have command-line equivalents.

The following keywords are supported in the "global" section:

* Process management and security
- ca-base
- chroot
- crt-base
- cpu-map
- daemon
- description
- deviceatlas-json-file
- deviceatlas-log-level
- deviceatlas-separator
- deviceatlas-properties-cookie
- external-check
- gid
- group
- hard-stop-after
- log
- log-tag
- log-send-hostname
- lua-load
- nbproc
- nbthread
- node
- pidfile
- presetenv
- resetenv
- uid
- ulimit-n
- user
- setenv
- stats
- ssl-default-bind-ciphers
- ssl-default-bind-ciphersuites
- ssl-default-bind-options
- ssl-default-server-ciphers
- ssl-default-server-ciphersuites
- ssl-default-server-options
- ssl-dh-param-file
- ssl-server-verify
- unix-bind
- unsetenv
- 51degrees-data-file
- 51degrees-property-name-list
- 51degrees-property-separator
- 51degrees-cache-size
- wurfl-data-file
- wurfl-information-list
- wurfl-information-list-separator
- wurfl-engine-mode
- wurfl-cache-size
- wurfl-useragent-priority

* Performance tuning

- max-spread-checks
- maxconn
- maxconnrate
- maxcomprate
- maxcompcpuusage
- maxpipes
- maxsessrate
- maxsslconn
- maxsslrate
- maxzlibmem
- noepoll
- nokqueue
- nopoll
- nosplice
- nogetaddrinfo
- noreuseport
- profiling.tasks
- spread-checks
- server-state-base
- server-state-file
- ssl-engine
- ssl-mode-async
- tune.buffers.limit
- tune.buffers.reserve
- tune.bufsize
- tune.chksize
- tune.comp.maxlevel
- tune.h2.header-table-size
- tune.h2.initial-window-size
- tune.h2.max-concurrent-streams
- tune.http.cookielen
- tune.http.logurilen
- tune.http.maxhdr
- tune.idletimer
- tune.lua.forced-yield
- tune.lua.maxmem
- tune.lua.session-timeout
- tune.lua.task-timeout
- tune.lua.service-timeout
- tune.maxaccept
- tune.maxpollevents
- tune.maxrewrite
- tune.pattern.cache-size
- tune.pipesize
- tune.rcvbuf.client
- tune.rcvbuf.server
- tune.recv_enough
- tune.runqueue-depth
- tune.sndbuf.client
- tune.sndbuf.server
- tune.ssl.cachesize
- tune.ssl.lifetime
- tune.ssl.force-private-cache
- tune.ssl.maxrecord
- tune.ssl.default-dh-param
- tune.ssl.ssl-ctx-cache-size
- tune.ssl.capture-cipherlist-size
- tune.vars.global-max-size
- tune.vars.proc-max-size
- tune.vars.reqres-max-size
- tune.vars.sess-max-size
- tune.vars.txn-max-size
- tune.zlib.memlevel
- tune.zlib.windowsize

* Debugging

- debug
- quiet

3.1. Process management and security

ca-base <dir>

Assigns a default directory from which to fetch SSL CA certificates and CRLs (certificate revocation lists) when a relative path is used with the "ca-file" or "crl-file" directives. Absolute locations specified in "ca-file" and "crl-file" prevail and ignore "ca-base".

chroot <jail dir>

Changes the current directory to <jail dir> and performs a chroot() there before dropping privileges. This raises the security level in case an unknown vulnerability is exploited, since it makes it very hard for an attacker to compromise the whole system. This only works when the process is started with superuser privileges. Make sure that <jail_dir> is both empty and non-writable to anyone.

cpu-map [auto:]<process-set>[/<thread-set>] <cpu-set>...

On Linux 2.6 and above, it is possible to bind a process or a thread to a specific CPU set; the process or thread will then never run on other CPUs. The "cpu-map" directive specifies CPU sets for process or thread sets. The first argument is a process set, optionally followed by a thread set. These sets have the following format:

all | odd | even | number[-[number]]

<number> must be a number between 1 and 32 or 64, depending on the machine's word size. Any process IDs above nbproc and any thread IDs above nbthread are ignored. A range can be specified with two such numbers delimited by a dash ("-"). It is also possible to specify all processes at once with "all", only odd numbers with "odd", or even numbers with "even", just like with the "bind-process" directive. The second and following arguments are CPU sets. Each CPU set is either a unique number between 0 and 31 or 63, or a range of two such numbers delimited by a dash. Multiple "cpu-map" directives may be specified, and each one replaces the previous ones where they overlap. A thread is bound on the intersection of its own mapping and the mapping of the process it is attached to; if the intersection is empty, no specific binding is set for the thread.

Ranges can be partially defined: the upper bound can be omitted, in which case it is replaced by the corresponding maximum value, 32 or 64 depending on the machine's word size.

The "auto:" prefix can be added before the process set to let HAProxy automatically bind a process or a thread to a CPU by incrementing the process/thread and CPU sets. To be valid, both sets must have the same size. Regardless of the declaration order of the CPU sets, binding proceeds from the lowest to the highest bound. Using the "auto:" prefix with both a process range and a thread range is not supported; only one range is supported, and the other one must be a fixed number.

Examples:
cpu-map 1-4 0-3   # bind processes 1 to 4 to the first 4 CPUs

cpu-map 1/all 0-3 # bind all threads of the first process to the first 4 CPUs

cpu-map 1- 0-     # will be replaced by "cpu-map 1-64 0-63"
                  # or "cpu-map 1-32 0-31" depending on the machine's word size

# all these lines bind process 1 to cpu 0, process 2 to cpu 1, and so on.
cpu-map auto:1-4 0-3
cpu-map auto:1-4 0-1 2-3
cpu-map auto:1-4 3 2 1 0

# all these lines bind thread 1 to cpu 0, thread 2 to cpu 1, and so on.
cpu-map auto:1/1-4 0-3
cpu-map auto:1/1-4 0-1 2-3
cpu-map auto:1/1-4 3 2 1 0

# bind each process to exactly one CPU using the all/odd/even keywords
cpu-map auto:all 0-63
cpu-map auto:even 0-31
cpu-map auto:odd 32-63

# invalid cpu-map settings, because the process and CPU sets have different sizes
cpu-map auto:1-4 0 # invalid
cpu-map auto:1 0-3 # invalid

# invalid cpu-map settings, because automatic binding is used with both a process range
# and a thread range
cpu-map auto:all/all 0 # invalid
cpu-map auto:all/1-4 0 # invalid
cpu-map auto:1-4/all 0 # invalid

crt-base <dir>

Assigns a default directory from which to fetch SSL certificates when a relative path is used with "crtfile" directives. Absolute locations specified after "crtfile" prevail and ignore "crt-base".

daemon

Makes the process fork into the background. This is the recommended mode of operation. It is equivalent to the command-line "-D" argument and can be disabled with the "-db" argument. This option is ignored in systemd mode.

deviceatlas-json-file <path>

Sets the path of the DeviceAtlas JSON data file to be loaded by the API. The path must be a valid JSON data file and accessible by the HAProxy process.

deviceatlas-log-level <value>

Sets the level of information returned by the API. This directive is optional and defaults to 0 if not set.

deviceatlas-separator <char>

Sets the character separator for the API properties results. This directive is optional and defaults to | if not set.

deviceatlas-properties-cookie <name>

Sets the name of the client cookie used to detect whether the DeviceAtlas client-side component was used during the request. This directive is optional and defaults to DAPROPS if not set.

external-check

Allows the use of an external agent to perform health checks. This is disabled by default as a security precaution. See "option external-check".

gid <number>

Changes the process's group ID to <number>. It is recommended that the group ID be dedicated to HAProxy or to a small set of similar daemons. HAProxy must be started with a user belonging to this group, or with superuser privileges. Note that if haproxy is started from a user having supplementary groups, it will only be able to drop those groups if started with superuser privileges. See also "group" and "uid".

hard-stop-after <time>

Defines the maximum time allowed to perform a clean soft-stop.

Arguments:
<time> is the maximum time (by default in milliseconds) for which the instance will remain alive after a soft-stop is received via the SIGUSR1 signal.

This may be used to ensure that the instance will quit even if connections remain open during a soft-stop (for example with long timeouts for a proxy in tcp mode). It applies in both TCP and HTTP mode.

Example:
global
hard-stop-after 30s

group <group name>

Similar to "gid", but uses the GID of the group named <group name> from /etc/group. See also "gid" and "user".

To be continued; this chapter is long and will be split over several posts.

------------------------------ The English original follows -------------------------------

3. Global parameters

Parameters in the "global" section are process-wide and often OS-specific. They
are generally set once for all and do not need being changed once correct. Some
of them have command-line equivalents.

The following keywords are supported in the "global" section :

 * Process management and security
   - ca-base
   - chroot
   - crt-base
   - cpu-map
   - daemon
   - description
   - deviceatlas-json-file
   - deviceatlas-log-level
   - deviceatlas-separator
   - deviceatlas-properties-cookie
   - external-check
   - gid
   - group
   - hard-stop-after
   - log
   - log-tag
   - log-send-hostname
   - lua-load
   - nbproc
   - nbthread
   - node
   - pidfile
   - presetenv
   - resetenv
   - uid
   - ulimit-n
   - user
   - setenv
   - stats
   - ssl-default-bind-ciphers
   - ssl-default-bind-ciphersuites
   - ssl-default-bind-options
   - ssl-default-server-ciphers
   - ssl-default-server-ciphersuites
   - ssl-default-server-options
   - ssl-dh-param-file
   - ssl-server-verify
   - unix-bind
   - unsetenv
   - 51degrees-data-file
   - 51degrees-property-name-list
   - 51degrees-property-separator
   - 51degrees-cache-size
   - wurfl-data-file
   - wurfl-information-list
   - wurfl-information-list-separator
   - wurfl-engine-mode
   - wurfl-cache-size
   - wurfl-useragent-priority

 * Performance tuning
   - max-spread-checks
   - maxconn
   - maxconnrate
   - maxcomprate
   - maxcompcpuusage
   - maxpipes
   - maxsessrate
   - maxsslconn
   - maxsslrate
   - maxzlibmem
   - noepoll
   - nokqueue
   - nopoll
   - nosplice
   - nogetaddrinfo
   - noreuseport
   - profiling.tasks
   - spread-checks
   - server-state-base
   - server-state-file
   - ssl-engine
   - ssl-mode-async
   - tune.buffers.limit
   - tune.buffers.reserve
   - tune.bufsize
   - tune.chksize
   - tune.comp.maxlevel
   - tune.h2.header-table-size
   - tune.h2.initial-window-size
   - tune.h2.max-concurrent-streams
   - tune.http.cookielen
   - tune.http.logurilen
   - tune.http.maxhdr
   - tune.idletimer
   - tune.lua.forced-yield
   - tune.lua.maxmem
   - tune.lua.session-timeout
   - tune.lua.task-timeout
   - tune.lua.service-timeout
   - tune.maxaccept
   - tune.maxpollevents
   - tune.maxrewrite
   - tune.pattern.cache-size
   - tune.pipesize
   - tune.rcvbuf.client
   - tune.rcvbuf.server
   - tune.recv_enough
   - tune.runqueue-depth
   - tune.sndbuf.client
   - tune.sndbuf.server
   - tune.ssl.cachesize
   - tune.ssl.lifetime
   - tune.ssl.force-private-cache
   - tune.ssl.maxrecord
   - tune.ssl.default-dh-param
   - tune.ssl.ssl-ctx-cache-size
   - tune.ssl.capture-cipherlist-size
   - tune.vars.global-max-size
   - tune.vars.proc-max-size
   - tune.vars.reqres-max-size
   - tune.vars.sess-max-size
   - tune.vars.txn-max-size
   - tune.zlib.memlevel
   - tune.zlib.windowsize

 * Debugging
   - debug
   - quiet

3.1. Process management and security

ca-base <dir>
Assigns a default directory to fetch SSL CA certificates and CRLs from when a
relative path is used with "ca-file" or "crl-file" directives. Absolute
locations specified in "ca-file" and "crl-file" prevail and ignore "ca-base".
chroot <jail dir>
Changes current directory to <jail dir> and performs a chroot() there before
dropping privileges. This increases the security level in case an unknown
vulnerability would be exploited, since it would make it very hard for the
attacker to exploit the system. This only works when the process is started
with superuser privileges. It is important to ensure that <jail_dir> is both
empty and non-writable to anyone.
cpu-map [auto:]<process-set>[/<thread-set>] <cpu-set>...
On Linux 2.6 and above, it is possible to bind a process or a thread to a
specific CPU set. This means that the process or the thread will never run on
other CPUs. The "cpu-map" directive specifies CPU sets for process or thread
sets. The first argument is a process set, eventually followed by a thread
set. These sets have the format

    all | odd | even | number[-[number]]

<number> must be a number between 1 and 32 or 64, depending on the machine's
word size. Any process IDs above nbproc and any thread IDs above nbthread are
ignored. It is possible to specify a range with two such numbers delimited by
a dash ('-'). It also is possible to specify all processes at once using
"all", only odd numbers using "odd" or even numbers using "even", just like
with the "bind-process" directive. The second and forthcoming arguments are
CPU sets. Each CPU set is either a unique number between 0 and 31 or 63 or a
range with two such numbers delimited by a dash ('-'). Multiple CPU numbers
or ranges may be specified, and the processes or threads will be allowed to
bind to all of them. Obviously, multiple "cpu-map" directives may be
specified. Each "cpu-map" directive will replace the previous ones when they
overlap. A thread will be bound on the intersection of its mapping and the
one of the process on which it is attached. If the intersection is null, no
specific binding will be set for the thread.

Ranges can be partially defined. The higher bound can be omitted. In such
case, it is replaced by the corresponding maximum value, 32 or 64 depending
on the machine's word size.

The prefix "auto:" can be added before the process set to let HAProxy
automatically bind a process or a thread to a CPU by incrementing
process/thread and CPU sets. To be valid, both sets must have the same
size. No matter the declaration order of the CPU sets, it will be bound from
the lowest to the highest bound. Having a process and a thread range with the
"auto:" prefix is not supported. Only one range is supported, the other one
must be a fixed number.
Examples:
cpu-map 1-4 0-3   # bind processes 1 to 4 on the first 4 CPUs

cpu-map 1/all 0-3 # bind all threads of the first process on the
                  # first 4 CPUs

cpu-map 1- 0-     # will be replaced by "cpu-map 1-64 0-63"
                  # or "cpu-map 1-32 0-31" depending on the machine's
                  # word size.

# all these lines bind the process 1 to the cpu 0, the process 2 to cpu 1
# and so on.
cpu-map auto:1-4 0-3
cpu-map auto:1-4 0-1 2-3
cpu-map auto:1-4 3 2 1 0

# all these lines bind the thread 1 to the cpu 0, the thread 2 to cpu 1
# and so on.
cpu-map auto:1/1-4 0-3
cpu-map auto:1/1-4 0-1 2-3
cpu-map auto:1/1-4 3 2 1 0

# bind each process to exactly one CPU using all/odd/even keyword
cpu-map auto:all 0-63
cpu-map auto:even 0-31
cpu-map auto:odd 32-63

# invalid cpu-map because process and CPU sets have different sizes.
cpu-map auto:1-4 0 # invalid
cpu-map auto:1 0-3 # invalid

# invalid cpu-map because automatic binding is used with a process range
# and a thread range.
cpu-map auto:all/all 0 # invalid
cpu-map auto:all/1-4 0 # invalid
cpu-map auto:1-4/all 0 # invalid
crt-base <dir>
Assigns a default directory to fetch SSL certificates from when a relative
path is used with "crtfile" directives. Absolute locations specified after
"crtfile" prevail and ignore "crt-base".
daemon
Makes the process fork into background. This is the recommended mode of
operation. It is equivalent to the command line "-D" argument. It can be
disabled by the command line "-db" argument. This option is ignored in
systemd mode.
deviceatlas-json-file <path>
Sets the path of the DeviceAtlas JSON data file to be loaded by the API.
The path must be a valid JSON data file and accessible by HAProxy process.
deviceatlas-log-level <value>
Sets the level of information returned by the API. This directive is
optional and set to 0 by default if not set.
deviceatlas-separator <char>
Sets the character separator for the API properties results. This directive
is optional and set to | by default if not set.
deviceatlas-properties-cookie <name>
Sets the client cookie's name used for the detection if the DeviceAtlas
Client-side component was used during the request. This directive is optional
and set to DAPROPS by default if not set.
external-check
Allows the use of an external agent to perform health checks.
This is disabled by default as a security precaution.
See "option external-check".
gid <number>
Changes the process' group ID to <number>. It is recommended that the group
ID is dedicated to HAProxy or to a small set of similar daemons. HAProxy must
be started with a user belonging to this group, or with superuser privileges.
Note that if haproxy is started from a user having supplementary groups, it
will only be able to drop these groups if started with superuser privileges.
See also "group" and "uid".
hard-stop-after <time>
Defines the maximum time allowed to perform a clean soft-stop.
Arguments :
<time>  is the maximum time (by default in milliseconds) for which the
        instance will remain alive when a soft-stop is received via the
        SIGUSR1 signal.
This may be used to ensure that the instance will quit even if connections
remain opened during a soft-stop (for example with long timeouts for a proxy
in tcp mode). It applies both in TCP and HTTP mode.
Example:
global
  hard-stop-after 30s
group <group name>
Similar to "gid" but uses the GID of group name <group name> from /etc/group.
See also "gid" and "user".

 

iOS image upload handling: image compression and image processing



Getting an image from the camera or the photo album is a user-facing operation: the user browses and selects the picture the app will use. For this we interact with the user through the UIImagePickerController class.

To interact with the user via UIImagePickerController, we need to implement two protocols: <UIImagePickerControllerDelegate, UINavigationControllerDelegate>.


The code is as follows:

#pragma mark Pick an image from the user's photo album
- (void)pickImageFromAlbum
{
    imagePicker = [[UIImagePickerController alloc] init];
    imagePicker.delegate = self;
    imagePicker.sourceType = UIImagePickerControllerSourceTypePhotoLibrary;
    imagePicker.modalTransitionStyle = UIModalTransitionStyleCoverVertical;
    imagePicker.allowsEditing = YES;

    [self presentModalViewController:imagePicker animated:YES];
}

Looking at the album-picking code above: we first instantiate a UIImagePickerController, set its delegate to the current object, and set the picker's image source to UIImagePickerControllerSourceTypePhotoLibrary to indicate the image comes from the photo album. We can also choose whether the user is allowed to edit the picked image.


The code is as follows:

#pragma mark Pick an image from the camera
- (void)pickImageFromCamera
{
    imagePicker = [[UIImagePickerController alloc] init];
    imagePicker.delegate = self;
    imagePicker.sourceType = UIImagePickerControllerSourceTypeCamera;
    imagePicker.modalTransitionStyle = UIModalTransitionStyleCoverVertical;
    imagePicker.allowsEditing = YES;

    [self presentModalViewController:imagePicker animated:YES];
}

// Open the camera
- (IBAction)touch_photo:(id)sender {
    // for iPhone
    UIImagePickerController *pickerImage = [[UIImagePickerController alloc] init];
    if ([UIImagePickerController isSourceTypeAvailable:UIImagePickerControllerSourceTypeCamera]) {
        pickerImage.sourceType = UIImagePickerControllerSourceTypeCamera;
        pickerImage.mediaTypes = [UIImagePickerController availableMediaTypesForSourceType:pickerImage.sourceType];
    }
    pickerImage.delegate = self;
    pickerImage.allowsEditing = YES;    // allow the user to crop/edit the photo
    [self presentViewController:pickerImage animated:YES completion:nil];
}

The code above gets an image from the camera; it differs from the album case only in the image source, which is UIImagePickerControllerSourceTypeCamera.

After the interaction, once the user has chosen an image, the picker calls back the method that signals the selection has finished.

- (void)imagePickerController:(UIImagePickerController *)picker didFinishPickingMediaWithInfo:(NSDictionary *)info
{
    // imageNew is the image obtained from the camera/album
    UIImage *imageNew = [info objectForKey:@"UIImagePickerControllerOriginalImage"];
    // Set the target size for the image
    CGSize imagesize = imageNew.size;
    imagesize.height = 626;
    imagesize.width = 413;
    // Scale the image down to the target size
    imageNew = [self imageWithImage:imageNew scaledToSize:imagesize];
    NSData *imageData = UIImageJPEGRepresentation(imageNew, 0.00001);
    if (m_selectimage == nil)
    {
        m_selectimage = [UIImage imageWithData:imageData];
        NSLog(@"m_selectimage:%@", m_selectimage);
        [self.TakePhotoBtn setImage:m_selectimage forState:UIControlStateNormal];
        [picker dismissModalViewControllerAnimated:YES];
        return;
    }
    [picker release];
}

// Scale an image to a new size
- (UIImage *)imageWithImage:(UIImage *)image scaledToSize:(CGSize)newSize
{
    // Create a graphics image context
    UIGraphicsBeginImageContext(newSize);

    // Tell the old image to draw in this new context, with the desired new size
    [image drawInRect:CGRectMake(0, 0, newSize.width, newSize.height)];

    // Get the new image from the context
    UIImage *newImage = UIGraphicsGetImageFromCurrentImageContext();

    // End the context
    UIGraphicsEndImageContext();

    // Return the new image.
    return newImage;
}



