When to use MongoDB or another document-oriented database system? (When to use MongoDB)

25-03-11 10


This article covers the Python DB-API methods fetchone, fetchmany, and fetchall in detail and answers common questions about fetching in Python. It also collects related material: a bug where @OneToMany(fetch = FetchType.EAGER) eager loading returns duplicate detail rows, Ajax and Fetch, Cache Fetched AJAX Requests Locally: Wrapping the Fetch API, and Cache and Fetch (Part 2).

Contents:

Python DB-API: fetchone vs. fetchmany vs. fetchall (fetching in Python)
@OneToMany(fetch = FetchType.EAGER): eager loading returns duplicate detail rows
Ajax and Fetch
Cache Fetched AJAX Requests Locally: Wrapping the Fetch API
Cache and Fetch (Part 2)

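Python DB-API: fetchone vs. fetchmany vs. fetchall (fetching in Python)

I just had a discussion today with some coworkers about Python's DB-API fetchone vs. fetchmany vs. fetchall.

I'm sure the use case for each of these depends on the implementation of the DB-API that I'm using, but in general, what are the use cases for fetchone, fetchmany, and fetchall?

In other words, are the following equivalent? Or is one of them preferred over the others? If so, in which situations?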

# Variant 1: fetchone, one row per call
cursor.execute("SELECT id, name FROM `table`")
for i in range(cursor.rowcount):
    id, name = cursor.fetchone()
    print(id, name)

# Variant 2: fetchmany, one batch (cursor.arraysize rows by default) per call
cursor.execute("SELECT id, name FROM `table`")
result = cursor.fetchmany()
while result:
    for id, name in result:
        print(id, name)
    result = cursor.fetchmany()

# Variant 3: fetchall, every remaining row at once
cursor.execute("SELECT id, name FROM `table`")
for id, name in cursor.fetchall():
    print(id, name)

Answer 1

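I think it indeed depends on the implementation, but you can get an idea of the differences by looking at the MySQLdb sources. Depending on the options, MySQLdb's fetch* methods keep the current set of rows either in memory or on the server side, so fetchmany vs. fetchone gives you some flexibility in deciding what to keep in (Python's) memory and what to keep on the database server side.

PEP 249 does not give much detail, so I guess this is left to be optimized per database, and the exact semantics are implementation-defined.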
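To make the three access patterns concrete, here is a minimal sketch using the standard-library sqlite3 driver. It is an illustration under assumptions, not a statement about MySQLdb: the in-memory `people` table, its rows, and the three helper functions are made up for this example, and how rows are buffered between client and server still varies by driver.

```python
import sqlite3

# Hypothetical in-memory table used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO people (id, name) VALUES (?, ?)",
    [(i, f"user{i}") for i in range(1, 8)],
)

QUERY = "SELECT id, name FROM people ORDER BY id"


def read_with_fetchone(conn):
    """Pull one row per call; fetchone() returns None when exhausted."""
    cur = conn.execute(QUERY)
    rows = []
    while True:
        row = cur.fetchone()
        if row is None:
            break
        rows.append(row)
    return rows


def read_with_fetchmany(conn, batch_size=3):
    """Pull rows in batches; fetchmany() returns an empty list when done."""
    cur = conn.execute(QUERY)
    rows, batches = [], 0
    while True:
        batch = cur.fetchmany(batch_size)
        if not batch:
            break
        batches += 1
        rows.extend(batch)
    return rows, batches


def read_with_fetchall(conn):
    """Materialize every remaining row in one list."""
    return conn.execute(QUERY).fetchall()
```

All three return the same rows; the difference is how many rows are held in Python at once. Looping until fetchone() returns None also avoids relying on cursor.rowcount, which PEP 249 allows a driver to leave undetermined (-1) for a SELECT.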

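@OneToMany(fetch = FetchType.EAGER): eager loading returns duplicate detail rows

The project uses Spring Data JPA.

After the many-to-one and one-to-many associations between the table entities were configured,

a colleague found that the "many" side of a query result contained two entries, both with the same ID.

At first they suspected Alibaba's fastjson: duplicate or circular references during JSON serialization seemed a likely cause of the bug.

But disabling fastjson's duplicate and circular reference handling did not fix it.

I happened to notice the problem and looked into it. At first I also assumed it was a fastjson bug,

but when I inspected the data in the debugger, the detail list itself already contained two identical entries.

So the association mapping itself had to be misconfigured.


The association mapping code:

The "one" side: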

 @OneToMany(fetch = FetchType.EAGER, mappedBy = "trade")
    private List<TcOrder> orderList = new ArrayList<>();



The "many" side:


 /**
     * Parent order (trade) ID
     */
    @Column(name = "trade_id", nullable = true, length = 19)
    private Long tradeId;


  @ManyToOne()
    @JoinColumn(name = "trade_id", unique = true, insertable = false, updatable = false)
    @JSONField(serialize = false)
    private TcTrade trade;



The SQL generated for a query with this configuration:
 select
        tctrade0_.id as id1_158_0_,
        tctrade0_.account_status as account_2_158_0_,
        tctrade0_.addres_id as addres_i3_158_0_,
        tctrade0_.biz_type as biz_type4_158_0_,
        tctrade0_.buyer_id as buyer_id5_158_0_,
        tctrade0_.buyer_name as buyer_na6_158_0_,
        tctrade0_.buyer_org_id as buyer_or7_158_0_,
        tctrade0_.buyer_type as buyer_ty8_158_0_,
        tctrade0_.cancel_reason as cancel_r9_158_0_,
        tctrade0_.cancel_time as cancel_10_158_0_,
        tctrade0_.cancel_type as cancel_11_158_0_,
        tctrade0_.channel as channel12_158_0_,
        tctrade0_.create_time as create_13_158_0_,
        tctrade0_.customer_id as custome14_158_0_,
        tctrade0_.customer_name as custome15_158_0_,
        tctrade0_.customer_nick_name as custome16_158_0_,
        tctrade0_.dealer_id as dealer_17_158_0_,
        tctrade0_.dealer_name as dealer_18_158_0_,
        tctrade0_.delivery_amount as deliver19_158_0_,
        tctrade0_.delivery_status as deliver20_158_0_,
        tctrade0_.delivery_way as deliver21_158_0_,
        tctrade0_.discount_amount as discoun22_158_0_,
        tctrade0_.fans_id as fans_id23_158_0_,
        tctrade0_.fee_item_amount as fee_ite24_158_0_,
        tctrade0_.follower_id as followe25_158_0_,
        tctrade0_.follower_org_id as followe26_158_0_,
        tctrade0_.goods_amount as goods_a27_158_0_,
        tctrade0_.have_invoice as have_in28_158_0_,
        tctrade0_.hedge_type as hedge_t29_158_0_,
        tctrade0_.income_amount as income_30_158_0_,
        tctrade0_.invoice_id as invoice31_158_0_,
        tctrade0_.invoice_no as invoice32_158_0_,
        tctrade0_.min_send_amount as min_sen33_158_0_,
        tctrade0_.modify_time as modify_34_158_0_,
        tctrade0_.order_type as order_t35_158_0_,
        tctrade0_.org_id as org_id36_158_0_,
        tctrade0_.pay_amount as pay_amo37_158_0_,
        tctrade0_.pay_status as pay_sta38_158_0_,
        tctrade0_.pay_time as pay_tim39_158_0_,
        tctrade0_.payment_id as payment40_158_0_,
        tctrade0_.payment_no as payment41_158_0_,
        tctrade0_.promoter_id as promote42_158_0_,
        tctrade0_.promoter_name as promote43_158_0_,
        tctrade0_.promoter_org_id as promote44_158_0_,
        tctrade0_.promoter_type as promote45_158_0_,
        tctrade0_.refunded_amount as refunde46_158_0_,
        tctrade0_.remark as remark47_158_0_,
        tctrade0_.remind_delivery as remind_48_158_0_,
        tctrade0_.removed as removed49_158_0_,
        tctrade0_.require_deliver_date as require50_158_0_,
        tctrade0_.seller_id as seller_51_158_0_,
        tctrade0_.seller_name as seller_52_158_0_,
        tctrade0_.seller_org_id as seller_53_158_0_,
        tctrade0_.seller_type as seller_54_158_0_,
        tctrade0_.source_id as source_55_158_0_,
        tctrade0_.source_no as source_56_158_0_,
        tctrade0_.staff_id as staff_i57_158_0_,
        tctrade0_.staff_name as staff_n58_158_0_,
        tctrade0_.store_id as store_i59_158_0_,
        tctrade0_.store_name as store_n60_158_0_,
        tctrade0_.total_amount as total_a61_158_0_,
        tctrade0_.trade_img as trade_i62_158_0_,
        tctrade0_.trade_no as trade_n63_158_0_,
        tctrade0_.trade_status as trade_s64_158_0_,
        tctrade0_.uuid as uuid65_158_0_,
        orderlist1_.trade_id as trade_i29_146_1_,
        orderlist1_.id as id1_146_1_,
        orderlist1_.id as id1_146_2_,
        orderlist1_.brand_id as brand_id2_146_2_,
        orderlist1_.brand_name as brand_na3_146_2_,
        orderlist1_.create_time as create_t4_146_2_,
        orderlist1_.customer_id as customer5_146_2_,
        orderlist1_.customer_name as customer6_146_2_,
        orderlist1_.customer_nick_name as customer7_146_2_,
        orderlist1_.dealer_id as dealer_i8_146_2_,
        orderlist1_.dealer_name as dealer_n9_146_2_,
        orderlist1_.delivery_amount as deliver10_146_2_,
        orderlist1_.discount_amount as discoun11_146_2_,
        orderlist1_.express_code as express12_146_2_,
        orderlist1_.express_name as express13_146_2_,
        orderlist1_.express_no as express14_146_2_,
        orderlist1_.fans_id as fans_id15_146_2_,
        orderlist1_.goods_amount as goods_a16_146_2_,
        orderlist1_.income_amount as income_17_146_2_,
        orderlist1_.modify_time as modify_18_146_2_,
        orderlist1_.order_img as order_i19_146_2_,
        orderlist1_.order_no as order_n20_146_2_,
        orderlist1_.order_state as order_s21_146_2_,
        orderlist1_.org_id as org_id22_146_2_,
        orderlist1_.removed as removed23_146_2_,
        orderlist1_.staff_id as staff_i24_146_2_,
        orderlist1_.staff_name as staff_n25_146_2_,
        orderlist1_.store_id as store_i26_146_2_,
        orderlist1_.store_name as store_n27_146_2_,
        orderlist1_.total_amount as total_a28_146_2_,
        orderlist1_.trade_id as trade_i29_146_2_,
        itemlist2_.order_id as order_i24_147_3_,
        itemlist2_.id as id1_147_3_,
        itemlist2_.id as id1_147_4_,
        itemlist2_.actual_price as actual_p2_147_4_,
        itemlist2_.appraise_id as appraise3_147_4_,
        itemlist2_.category_id as category4_147_4_,
        itemlist2_.category_name as category5_147_4_,
        itemlist2_.create_time as create_t6_147_4_,
        itemlist2_.dealer_id as dealer_i7_147_4_,
        itemlist2_.dealer_name as dealer_n8_147_4_,
        itemlist2_.deliver_num as deliver_9_147_4_,
        itemlist2_.deliver_status as deliver10_147_4_,
        itemlist2_.deliver_time as deliver11_147_4_,
        itemlist2_.discount_price as discoun12_147_4_,
        itemlist2_.express_code as express13_147_4_,
        itemlist2_.express_name as express14_147_4_,
        itemlist2_.express_no as express15_147_4_,
        itemlist2_.goods_id as goods_i16_147_4_,
        itemlist2_.goods_no as goods_n17_147_4_,
        itemlist2_.goods_title as goods_t18_147_4_,
        itemlist2_.item_img as item_im19_147_4_,
        itemlist2_.item_state as item_st20_147_4_,
        itemlist2_.item_type as item_ty21_147_4_,
        itemlist2_.modify_time as modify_22_147_4_,
        itemlist2_.num as num23_147_4_,
        itemlist2_.order_id as order_i24_147_4_,
        itemlist2_.original_price as origina25_147_4_,
        itemlist2_.remark as remark26_147_4_,
        itemlist2_.removed as removed27_147_4_,
        itemlist2_.return_num as return_28_147_4_,
        itemlist2_.sale_price as sale_pr29_147_4_,
        itemlist2_.sale_type as sale_ty30_147_4_,
        itemlist2_.share_price as share_p31_147_4_,
        itemlist2_.shipper_id as shipper32_147_4_,
        itemlist2_.shipper_name as shipper33_147_4_,
        itemlist2_.sku_id as sku_id34_147_4_,
        itemlist2_.sku_no as sku_no35_147_4_,
        itemlist2_.source_id as source_36_147_4_,
        itemlist2_.source_item_id as source_37_147_4_,
        itemlist2_.source_no as source_38_147_4_,
        itemlist2_.source_type as source_39_147_4_,
        itemlist2_.specification as specifi40_147_4_,
        itemlist2_.store_id as store_i41_147_4_,
        itemlist2_.store_name as store_n42_147_4_,
        itemlist2_.trade_id as trade_i43_147_4_ 
    from
        tc_trade tctrade0_ 
    left outer join
        tc_order orderlist1_ 
            on tctrade0_.id=orderlist1_.trade_id 
    left outer join
        tc_order_item itemlist2_ 
            on orderlist1_.id=itemlist2_.order_id 
    where
        tctrade0_.id=175010


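The two left joins are what produce the duplicate detail rows: every tc_order row is repeated once for each matching tc_order_item row, and the List-typed orderList keeps each repetition.

The old code maps the "many" side differently from how I would. With @ManyToOne in place, you normally would not need to declare @JoinColumn explicitly.

Here it is declared, and removing it raises an error, because the trade_id column is already mapped by the tradeId field and would then be mapped twice. Fixing that properly means changing the code,

which could introduce new bugs.

Solutions

1. Work around it: after the query, load the order details again with a separate query and overwrite orderList with the result.

2. Switch to lazy loading. Avoid EAGER, which also hurts performance:

fetch = FetchType.LAZY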
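The row multiplication behind the duplicates can be reproduced without JPA. The following is a minimal sketch using Python's sqlite3 module; the three simplified tables and their rows are assumptions made for illustration and only mirror the shape of tc_trade, tc_order, and tc_order_item. It shows that one order with two items comes back as two joined rows, so a list built row by row contains the order twice, while an order-preserving de-duplication (roughly what a Set-typed collection gives you) contains it once.

```python
import sqlite3

# Simplified stand-ins for tc_trade / tc_order / tc_order_item (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE trade (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, trade_id INTEGER);
    CREATE TABLE order_item (id INTEGER PRIMARY KEY, order_id INTEGER);
    INSERT INTO trade VALUES (1);
    INSERT INTO orders VALUES (10, 1);
    INSERT INTO order_item VALUES (100, 10);
    INSERT INTO order_item VALUES (101, 10);
    """
)

# Same join shape as the generated SQL: trade -> orders -> items.
rows = conn.execute(
    """
    SELECT t.id, o.id, i.id
    FROM trade t
    LEFT OUTER JOIN orders o ON t.id = o.trade_id
    LEFT OUTER JOIN order_item i ON o.id = i.order_id
    WHERE t.id = 1
    ORDER BY i.id
    """
).fetchall()

# Collecting the order once per joined row keeps the repetition ...
order_ids_as_list = [order_id for _, order_id, _ in rows]
# ... while de-duplicating (order preserved) yields each order once.
order_ids_deduped = list(dict.fromkeys(order_ids_as_list))
```

With one order and two items, the join yields two rows, which matches the two identical entries seen in the debugger.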

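Ajax and Fetch

Introduction

When a page needs to request data from the server, it almost always does so with Ajax. Under the hood, Ajax uses the XMLHttpRequest object to request the data. XMLHttpRequest is used like this:

var xhr = new XMLHttpRequest();
xhr.open('GET', url, true);
xhr.onload = function() {
  console.log(xhr.response);
};
xhr.onerror = function() {
  console.error('error');
};
xhr.send();

As you can see, XMLHttpRequest handles the returned data through events. There is also a newer technique that handles the data with Promises; it is called Fetch.

Using Fetch

The most basic code for making a request with Fetch:

fetch(url).then(function (response) {
  return response.json();  // parse the response body as JSON
}).then(function (data) {
  console.log(data);  // business logic
}).catch(function (e) {
  console.error('error');
})

With ES6 arrow functions, this becomes more concise:

fetch(url).then(response => response.json())
.then(data => console.log(data))
.catch(e => console.error('error'));

async/await (ES2017) simplifies the code further. Note that await is only valid inside an async function or at the top level of a module:

async function load(url) {
  try {
    let response = await fetch(url);
    let data = await response.json();  // json() also returns a promise
    console.log(data);
  } catch(e) {
    console.error('error');
  }
}

This way, an asynchronous request can be written in a synchronous style.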

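Modifying request headers with Fetch

fetch also takes a second argument, an options object that configures the request. It sets the method (GET, POST, and so on) and the request headers. For a cross-origin request you can set mode to cors, although the request only succeeds if the server also returns the appropriate CORS headers.

var headers = new Headers({
  // Origin is a forbidden header name: the browser sets it itself and ignores this value
  "Origin": "http://taobao.com"
});
headers.append("Content-Type", "text/plain");

var init = {
  method: 'GET',
  headers: headers,
  mode: 'cors',
  cache: 'default'
};

fetch(url, init).then(response => response.json())
.then(data => console.log(data))
.catch(e => console.error('error'));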

Cache Fetched AJAX Requests Locally: Wrapping the Fetch API

This article is by guest author Peter Bengtsson. SitePoint guest posts aim to bring you engaging content from prominent writers and speakers of the JavaScript community


This article demonstrates how to implement a local cache of fetched requests, so that a repeated request is served from session storage instead of the network. The advantage of this is that you don’t need to have custom code for each resource you want cached.

Follow along if you want to look really cool at your next JavaScript dinner party, where you can show off various skills of juggling promises, state-of-the-art APIs and local storage.

The Fetch API

At this point you’re hopefully familiar with fetch. It’s a new native API in browsers to replace the old XMLHttpRequest API.

Where it hasn’t been perfectly implemented in all browsers, you can use GitHub’s fetch polyfill (And if you have nothing to do all day, here’s the Fetch Standard spec).

The Naïve Alternative

Suppose you know exactly which one resource you need to download and only want to download it once. You could use a global variable as your cache, something like this:

let origin = null
fetch('https://httpbin.org/get')
  .then(r => r.json())
  .then(information => {
    origin = information.origin  // your client's IP
  })

// need to delay to make sure the fetch has finished
setTimeout(() => {
  console.log('Your origin is ' + origin)
}, 3000)

That just relies on a global variable to hold the cached data. The immediate problem is that the cached data goes away if you reload the page or navigate to some new page.

Let’s upgrade our first naive solution before we dissect its shortcomings.

fetch('https://httpbin.org/get')
  .then(r => r.json())
  .then(info => {
    sessionStorage.setItem('information', JSON.stringify(info))
  })

// need to delay to make sure the fetch has finished
setTimeout(() => {
  let info = JSON.parse(sessionStorage.getItem('information'))
  console.log('Your origin is ' + info.origin)
}, 3000)

The first and immediate problem is that fetch is promise-based, meaning we can’t know for sure when it has finished, so we should not rely on its result until its promise resolves.

The second problem is that this solution is very specific to a particular URL and a particular piece of cached data (the key information in this example). What we want is a generic solution that is based on the URL instead.

First Implementation – Keeping It Simple

Let’s put a wrapper around fetch that also returns a promise. The code that calls it probably doesn’t care if the result came from the network or if it came from the local cache.

So imagine you used to do this:

fetch('https://httpbin.org/get')
  .then(r => r.json())
  .then(info => {
    console.log('Your origin is ' + info.origin)
  })

And now you want to wrap that, so that repeated network calls can benefit from a local cache. Let’s simply call it cachedFetch instead, so the code looks like this:

cachedFetch('https://httpbin.org/get')
  .then(r => r.json())
  .then(info => {
    console.log('Your origin is ' + info.origin)
  })

The first time that’s run, it needs to resolve the request over the network and store the result in the cache. The second time it should draw directly from the local storage.

Let’s start with the code that simply wraps the fetch function:

const cachedFetch = (url, options) => {
  return fetch(url, options)
}


This works, but is useless, of course. Let’s implement the storing of the fetched data to start with.

const cachedFetch = (url, options) => {
  // Use the URL as the cache key to sessionStorage
  let cacheKey = url
  return fetch(url, options).then(response => {
    // let's only store in cache if the status is 200 and
    // the content-type is JSON or something non-binary
    if (response.status === 200) {
      let ct = response.headers.get('Content-Type')
      if (ct && (ct.match(/application\/json/i) || ct.match(/text\//i))) {
        // There is a .json() instead of .text() but
        // we're going to store it in sessionStorage as
        // string anyway.
        // If we don't clone the response, it will be
        // consumed by the time it's returned. This
        // way we're being un-intrusive.
        response.clone().text().then(content => {
          sessionStorage.setItem(cacheKey, content)
        })
      }
    }
    return response
  })
}

There’s quite a lot going on here.

The first promise returned by fetch actually goes ahead and makes the GET request. If there are problems with CORS (Cross-Origin Resource Sharing) the .text(), .json() or .blob() methods won’t work.

The most interesting feature is that we have to clone the Response object returned by the first promise. If we don’t do that, we’re injecting ourselves too much and when the final user of the promise tries to call .json() (for example) they’ll get this error:

TypeError: Body has already been consumed.

The other thing to notice is the carefulness around what the response type is: we only store the response if the status code is 200 and if the content type is application/json or text/*. This is because sessionStorage can only store text.

Here’s an example of using this:

cachedFetch('https://httpbin.org/get')
  .then(r => r.json())
  .then(info => {
    console.log('Your origin is ' + info.origin)
  })

cachedFetch('https://httpbin.org/html')
  .then(r => r.text())
  .then(document => {
    // the g flag is needed so that match() returns every <p>, not just the first
    console.log('Document has ' + document.match(/<p>/g).length + ' paragraphs')
  })

cachedFetch('https://httpbin.org/image/png')
  .then(r => r.blob())
  .then(image => {
    console.log('Image is ' + image.size + ' bytes')
  })

What’s neat about this solution so far is that it works, without interfering, for both JSON and HTML requests. And when it’s an image, it does not attempt to store that in sessionStorage.

Second Implementation – Actually Return Cache Hits

So our first implementation just takes care of storing the responses of requests. But if you call cachedFetch a second time, it doesn’t yet bother to try to retrieve anything from sessionStorage. What we need to do is, first of all, return a promise, and that promise needs to resolve to a Response object.

Let’s start with a very basic implementation:

const cachedFetch = (url, options) => {
  // Use the URL as the cache key to sessionStorage
  let cacheKey = url

  // START new cache HIT code
  let cached = sessionStorage.getItem(cacheKey)
  if (cached !== null) {
    // it was in sessionStorage! Yay!
    let response = new Response(new Blob([cached]))
    return Promise.resolve(response)
  }
  // END new cache HIT code

  return fetch(url, options).then(response => {
    // let's only store in cache if the content-type is
    // JSON or something non-binary
    if (response.status === 200) {
      let ct = response.headers.get('Content-Type')
      if (ct && (ct.match(/application\/json/i) || ct.match(/text\//i))) {
        // There is a .json() instead of .text() but
        // we're going to store it in sessionStorage as
        // string anyway.
        // If we don't clone the response, it will be
        // consumed by the time it's returned. This
        // way we're being un-intrusive.
        response.clone().text().then(content => {
          sessionStorage.setItem(cacheKey, content)
        })
      }
    }
    return response
  })
}

And it just works!

To see it in action, open the CodePen for this code and once you’re there open your browser’s Network tab in the developer tools. Press the “Run” button (top-right-ish corner of CodePen) a couple of times and you should see that only the image is being repeatedly requested over the network.

One thing that is neat about this solution is the lack of “callback spaghetti”. Since the sessionStorage.getItem call is synchronous (aka. blocking), we don’t have to deal with “Was it in the local storage?” inside a promise or callback. And only if there was something there, do we return the cached result. If not, the if statement just carries on to the regular code.

Third Implementation – What About Expiry Times?

So far we’ve been using sessionStorage, which is just like localStorage except that it is scoped to a single tab and gets wiped clean when that tab is closed. That means we’re riding a “natural way” of not caching things too long. If we were to use localStorage instead and cache something, it’d simply get stuck there “forever” even if the remote content has changed. And that’s bad.

A better solution is to give the user control instead. (The user in this case is the web developer using our cachedFetch function.) Like with storage such as Memcached or Redis on the server side, you set a lifetime specifying how long it should be cached.

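For example, in Python (with Flask). Note that werkzeug.contrib.cache was removed in Werkzeug 1.0; the same cache classes now live in the separate cachelib package.

>>> from werkzeug.contrib.cache import MemcachedCache
>>> cache = MemcachedCache(['127.0.0.1:11211'])
>>> cache.set('key', 'value', 10)
True
>>> cache.get('key')
'value'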
>>> # waiting 10 seconds
...
>>> cache.get(''key'')
>>>

Now, neither sessionStorage nor localStorage has this functionality built-in, so we have to implement it manually. We’ll do that by always taking note of the timestamp at the time of storing and use that to compare on a possible cache hit.

But before we do that, how is this going to look? How about something like this:

// Use a default expiry time, like 5 minutes
cachedFetch('https://httpbin.org/get')
  .then(r => r.json())
  .then(info => {
    console.log('Your origin is ' + info.origin)
  })

// Instead of passing options to `fetch` we pass an integer which is seconds
cachedFetch('https://httpbin.org/get', 2 * 60)  // 2 min
  .then(r => r.json())
  .then(info => {
    console.log('Your origin is ' + info.origin)
  })

// Combined with fetch's options object but called with a custom name
let init = {
  mode: 'same-origin',
  seconds: 3 * 60 // 3 minutes
}
cachedFetch('https://httpbin.org/get', init)
  .then(r => r.json())
  .then(info => {
    console.log('Your origin is ' + info.origin)
  })

The crucial new thing we’re going to add is that every time we save the response data, we also record when we stored it. But note that now we can also switch to the braver storage of localStorage instead of sessionStorage. Our custom expiry code will make sure we don’t get horribly stale cache hits in the otherwise persistent localStorage.

So here’s our final working solution:

const cachedFetch = (url, options) => {
  let expiry = 5 * 60 // 5 min default
  if (typeof options === 'number') {
    expiry = options
    options = undefined
  } else if (typeof options === 'object') {
    // I hope you didn't set it to 0 seconds
    expiry = options.seconds || expiry
  }
  // Use the URL as the cache key to localStorage
  let cacheKey = url
  let cached = localStorage.getItem(cacheKey)
  let whenCached = localStorage.getItem(cacheKey + ':ts')
  if (cached !== null && whenCached !== null) {
    // it was in localStorage! Yay!
    // Even though 'whenCached' is a string, this operation
    // works because the minus sign converts the
    // string to a number.
    let age = (Date.now() - whenCached) / 1000
    if (age < expiry) {
      let response = new Response(new Blob([cached]))
      return Promise.resolve(response)
    } else {
      // We need to clean up this old key
      localStorage.removeItem(cacheKey)
      localStorage.removeItem(cacheKey + ':ts')
    }
  }

  return fetch(url, options).then(response => {
    // let's only store in cache if the content-type is
    // JSON or something non-binary
    if (response.status === 200) {
      let ct = response.headers.get('Content-Type')
      if (ct && (ct.match(/application\/json/i) || ct.match(/text\//i))) {
        // There is a .json() instead of .text() but
        // we're going to store it in localStorage as
        // string anyway.
        // If we don't clone the response, it will be
        // consumed by the time it's returned. This
        // way we're being un-intrusive.
        response.clone().text().then(content => {
          localStorage.setItem(cacheKey, content)
          localStorage.setItem(cacheKey + ':ts', Date.now())
        })
      }
    }
    return response
  })
}
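The expiry logic in cachedFetch boils down to two steps: store a timestamp next to each value, then compare the entry's age against the allowed lifetime on lookup and drop it when it is too old. Here is a minimal Python sketch of just that idea, outside the browser. The ExpiringCache class, its method names, and the injectable clock parameter are hypothetical names introduced for illustration; they are not part of the article's code or of any library.

```python
import time


class ExpiringCache:
    """Key/value store that remembers when each value was stored."""

    def __init__(self, default_ttl=5 * 60, clock=time.monotonic):
        self._default_ttl = default_ttl  # seconds, like cachedFetch's default
        self._clock = clock              # injectable so tests can fake time
        self._store = {}                 # key -> (stored_at, value)

    def set(self, key, value):
        # Record the value together with the time it was stored.
        self._store[key] = (self._clock(), value)

    def get(self, key, ttl=None):
        entry = self._store.get(key)
        if entry is None:
            return None  # never cached
        stored_at, value = entry
        max_age = self._default_ttl if ttl is None else ttl
        if self._clock() - stored_at < max_age:
            return value  # fresh enough: cache hit
        # Too old: clean up the stale entry and report a miss.
        del self._store[key]
        return None
```

Passing the clock in as a parameter is a design choice for testability: the expiry path can be exercised without actually waiting for the lifetime to elapse.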

Future Implementation – Better, Fancier, Cooler

Not only are we avoiding hitting those web APIs excessively, the best part is that localStorage is a gazillion times faster than relying on network. See this blog post for a comparison of localStorage versus XHR: localForage vs. XHR. It measures other things but basically concludes that localStorage is really fast and disk-cache warm-ups are rare.

So how could we further improve our solution?

Dealing with binary responses

Our implementation here doesn’t bother caching non-text things, like images, but there’s no reason it can’t. We would need a bit more code. In particular, we probably want to store more information about the Blob. Every response is a Blob basically. For text and JSON it’s just an array of strings. And the type and size don’t really matter because it’s something you can figure out from the string itself. For binary content the blob has to be converted to an ArrayBuffer.

For the curious, to see an extension of our implementation that supports images, check out this CodePen.

Using hashed cache keys

Another potential improvement is to trade space for speed by hashing every URL, which was what we used as a key, to something much smaller. In the examples above we’ve been using just a handful of really small and neat URLs (e.g. https://httpbin.org/get) but if you have really large URLs with lots of query string thingies and you have lots of them, it can really add up.

A solution to this is to use this neat algorithm, which is simple and fast (it is the same scheme as Java’s String.hashCode). Note that it produces a 32-bit integer, so two different URLs can collide on the same key:

const hashstr = s => {
  let hash = 0;
  if (s.length == 0) return hash;
  for (let i = 0; i < s.length; i++) {
    let char = s.charCodeAt(i);
    hash = ((hash<<5)-hash)+char;
    hash = hash & hash; // Convert to 32bit integer
  }
  return hash;
}

If you like this, check out this CodePen. If you inspect the storage in your web console you’ll see keys like 557027443.
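To see what the hash function computes, here is a Python port, offered as a sketch rather than as part of the article's code. Python integers do not wrap at 32 bits, so the port masks explicitly and converts the result to a signed value, which is what `hash & hash` achieves in JavaScript. One assumption to be aware of: the port iterates over Unicode code points, while charCodeAt works on UTF-16 code units, so results can differ for characters outside the Basic Multilingual Plane.

```python
def hashstr(s):
    """Port of the article's hashstr: h = h * 31 + code, kept to 32 bits."""
    h = 0
    for ch in s:
        # (h << 5) - h is h * 31; the mask emulates 32-bit overflow.
        h = (h * 31 + ord(ch)) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as a signed integer,
    # matching JavaScript's bitwise-operator semantics.
    if h >= 0x80000000:
        h -= 0x100000000
    return h
```

Because the result is always a 32-bit integer, the key stays short no matter how long the URL is; the price is that distinct URLs can hash to the same key.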

Conclusion

You now have a working solution you can stick into your web apps, where perhaps you’re consuming a web API and you know the responses can be pretty well cached for your users.

One last thing that might be a natural extension of this prototype is to take it beyond an article and into a real, concrete project, with tests and a README, and publish it on npm – but that’s for another time!

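Cache and Fetch (Part 2)

A problem that had baffled me for the past two days is finally solved. Here it is:

The HQL "select distinct forumGroup from ForumGroup as forumGroup left join fetch forumGroup.forums" queries all ForumGroups and fetches their Forums along with them. The query cache is enabled for this query, and both ForumGroup and Forum are mapped as cacheable.

The first execution naturally misses the cache and generates the following SQL: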

    /* select
        distinct forumGroup
    from
        ForumGroup as forumGroup
    left join
        fetch forumGroup.forums */ select
            distinct forumgroup0_.id as id3_0_,
            forums1_.id as id2_1_,
            forumgroup0_.creationTime as creation2_3_0_,
            forumgroup0_.description as descript3_3_0_,
            forumgroup0_.modifiedTime as modified4_3_0_,
            forumgroup0_.name as name3_0_,
            forums1_.creationTime as creation2_2_1_,
            forums1_.description as descript3_2_1_,
            forums1_.groupId as groupId2_1_,
            forums1_.modifiedTime as modified4_2_1_,
            forums1_.name as name2_1_,
            forums1_.groupId as groupId0__,
            forums1_.id as id0__
        from
            ForumGroup forumgroup0_
        left outer join
            Forum forums1_
                on forumgroup0_.id=forums1_.groupId

The second execution should hit the cache and generate no SQL at all, but the actual result was:

DEBUG [13178395@qtp-12191562-0] StandardQueryCache.get(142) | returning cached query results

   /* load one-to-many oobbs.domainmodel.forum.ForumGroup.forums */ select
        forums0_.groupId as groupId1_,
        forums0_.id as id1_,
        forums0_.id as id2_0_,
        forums0_.creationTime as creation2_2_0_,
        forums0_.description as descript3_2_0_,
        forums0_.groupId as groupId2_0_,
        forums0_.modifiedTime as modified4_2_0_,
        forums0_.name as name2_0_
    from
        Forum forums0_
    where
        forums0_.groupId=?

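The SQL above appeared repeatedly.

Analyzing the log, I found: 1. The query did hit the cache; that much is certain. 2. The repeated SQL is generated while iterating over the ForumGroups, when the code accesses each group's Forum collection. This is the classic N+1 query problem.

What puzzled me: Forum is marked as cacheable, and the Forums were loaded by the fetch, so they went into the second-level cache on the first query. Why could they not be found on the second and third queries, so that the database had to be queried again? Investigation showed that the forums collection field on ForumGroup had not been configured as cacheable. So although the Forums were fetched and entered the second-level cache on the first query, the forums collection of each ForumGroup (which the cache stores as a collection of Forum IDs, not of Forum objects) never entered the second-level cache. The IDs were present in the first query's result set, but because the forums collection was not configured as cacheable, the collection information was discarded when the ForumGroup objects entered the second-level cache. On the next query, the query cache is hit, but the forums collection of each ForumGroup in the result is not populated, so Hibernate issues SQL again to load it.

Debugging also showed something interesting about what the query cache stores. The key needs little comment, since it is printed in the log, but the value is interesting. For the query above the value is [5228208135548928, 1, 1, 2]: the first element is a timestamp, and the second, third, and fourth are ForumGroup IDs. Because of the join, ForumGroup 1 appears twice. The Forum IDs are not in the cached value. This confirms that the query cache stores only the IDs of the objects in the result set. The fetched Forums are not treated as part of the result set, so they do not appear in it.

Some similar problems can occur, for example:

The first time list() runs with given conditions, nothing is found in the query cache, so regardless of whether the class cache holds data, a single SQL statement is sent to the database to fetch everything, and the query cache and class cache are then populated. The trouble comes on the second execution: if the class cache has a short timeout and has already expired while the query cache entry is still alive, list() gets the list of IDs and then loads the entities from the database one by one (the N+1 query problem). The class cache timeout must therefore never be shorter than the query cache timeout. If an idle time is also configured, make sure the class cache's idle time is also longer than the query cache's time to live. There are other cases too, such as the class cache being explicitly evicted by the program; watch out for those yourself. If these issues are not handled, the N+1 query problem will occur.

Two takeaways from this problem:

1. The query cache stores only the IDs of the objects in the result set; the object instances live in the second-level cache. This layout can become inconsistent in two ways. First, the entity objects are in the second-level cache, but the IDs of some associated objects were not cached, which causes SQL to be issued against the database again; the forums collection problem above is an example. Second, the query cache holds the object IDs, but the instances are no longer in the second-level cache, for example because the entries have expired; this also causes SQL to be issued against the database.

2. A collection field must be explicitly marked with @Cache for the collection to be cached. Note that what is cached is not the collection's elements but the elements' IDs. Whether the elements themselves can be cached depends on whether the element class is declared cacheable. If the collection is not configured as cacheable, then even if all its elements entered the second-level cache during the first query, navigating to the collection from the owning class next time still generates SQL: the collection is not cached, meaning the IDs of its elements are not cached, so Hibernate does not know which elements the owning object is associated with, even though those elements are already in the cache.

This example shows that cache settings are always static and global, unlike fetch strategies, which can be overridden dynamically.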
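The interaction described above can be mimicked with a toy model. The sketch below is emphatically not Hibernate's API or implementation: ToyHibernateCache, its fields, and the made-up group and forum IDs are all assumptions chosen to illustrate one point, namely that a query cache holding only ForumGroup IDs forces one extra statement per group unless each group's collection of Forum IDs is cached as well.

```python
class ToyHibernateCache:
    """Toy model: a query cache of IDs plus an optional collection cache."""

    def __init__(self, cache_collections):
        # "Database": ForumGroup id -> ids of its Forums (made-up data).
        self.db = {1: [11, 12], 2: [21]}
        self.cache_collections = cache_collections
        self.query_cache = None     # cached query result: ForumGroup ids only
        self.collection_cache = {}  # ForumGroup id -> Forum ids
        self.sql_count = 0          # number of simulated SQL statements

    def list_groups(self):
        if self.query_cache is None:
            # Cache miss: one join-fetch statement loads everything.
            self.sql_count += 1
            self.query_cache = list(self.db)
            if self.cache_collections:
                self.collection_cache = {g: list(f) for g, f in self.db.items()}
            return {g: list(f) for g, f in self.db.items()}
        # Cache hit: only the group ids are known from the query cache.
        result = {}
        for group_id in self.query_cache:
            forum_ids = self.collection_cache.get(group_id)
            if forum_ids is None:
                # Collection not cached: one extra statement per group (N+1).
                self.sql_count += 1
                forum_ids = list(self.db[group_id])
            result[group_id] = forum_ids
        return result


def sql_statements_for_two_runs(cache_collections):
    """Run the same query twice and count the simulated SQL statements."""
    cache = ToyHibernateCache(cache_collections)
    cache.list_groups()
    cache.list_groups()
    return cache.sql_count
```

In this model, two runs without the collection cache cost one statement for the first run plus one per group for the second, while two runs with the collection cache cost a single statement in total.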


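This article covers when to use MongoDB or another document-oriented database system and answers the question of when MongoDB is the right choice. It also collects related material: 11 open-source document-oriented NoSQL databases; CursusDB, a document-oriented in-memory database; Genji, a document-oriented embedded SQL database; and MongoDB study notes, part 2: getting started with MongoDB (databases, documents, and collections).

Contents:

When to use MongoDB or another document-oriented database system? (When to use MongoDB)
11 open-source document-oriented NoSQL databases
CursusDB: a document-oriented in-memory database
Genji: a document-oriented embedded SQL database
MongoDB study notes, part 2: getting started with MongoDB (databases, documents, and collections)

When to use MongoDB or another document-oriented database system? (When to use MongoDB)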

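We offer a platform for video and audio clips, photos, and vector graphics. We started with MySQL as the database backend and recently added MongoDB for storing all the meta-information of the files, because MongoDB fits the requirements better. For example, photos may have Exif information, and videos may have audio tracks whose meta-information we also want to store. Videos and vector graphics do not share any common meta-information, so I know that MongoDB is well suited to storing this unstructured data and keeping it searchable.

However, we continue to develop our platform and add features. One of the next steps is to provide a forum for our users. The question now is: should we use the MySQL database, which would be a good choice for storing forums and forum posts, or use MongoDB for this too?

So the question is: when should you use MongoDB, and when an RDBMS? If you had the choice, would you pick MongoDB or MySQL, and why?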

Answer 1

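In NoSQL: If Only It Was That Easy, the author writes about MongoDB:

MongoDB is not a key/value store; it is quite a bit more. It is definitely not an RDBMS either. I haven't used MongoDB in production, but I have built a test app with it, and it is a very cool piece of kit. It seems to be very performant and either has, or will have soon, fault tolerance and auto-sharding (in other words, it will scale). I think Mongo might be the closest thing to an RDBMS replacement that I have seen so far. It won't work for all data sets and access patterns, but it is built for typical CRUD work. Storing what is essentially a huge hash, and being able to select on any of its keys, is what most people use a relational database for. If your database is in 3NF and you don't do any joins (you just select from a bunch of tables and put all the objects together, which is what most people do in a web app), MongoDB would probably work well for you.

Then, in the conclusion:

The real thing to point out is that if you are being held back from making something super awesome because you can't choose a database, you are doing it wrong. If you know MySQL, just use it. Optimize when you actually need to. Use it like a k/v store, use it like an RDBMS, but for god's sake, build your killer app! None of this will matter for most apps. Facebook still uses MySQL, a lot. Wikipedia uses MySQL, a lot. FriendFeed uses MySQL, a lot. NoSQL is a great tool, but it is certainly not going to be your competitive edge, it is not going to make your app hot, and most of all, your users won't care about any of this.
What am I going to build my next app on? Probably Postgres. Will I use NoSQL? Maybe. I might also use Hadoop and Hive. I might keep everything in flat files. Maybe I'll start hacking on Maglev. I'll use whatever is best for the job. If I need reporting, I won't be using any NoSQL. If I need caching, I'll probably use Tokyo Tyrant. If I need ACIDity, I won't use NoSQL. If I need a ton of counters, I'll use Redis. If I need transactions, I'll use Postgres. If I have a ton of a single type of documents, I'll probably use Mongo. If I need to write 1 billion objects a day, I'd probably use Voldemort. If I need full-text search, I'd probably use Solr. If I need full-text search of volatile data, I'd probably use Sphinx.

I like this article. I find it very informative, and it gives a good overview of the NoSQL landscape and hype. But, and that is the most important part, it really helps to ask yourself the right questions when choosing between an RDBMS and NoSQL. Worth a read, IMHO.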


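11 open-source document-oriented NoSQL databases

Document-oriented databases are designed primarily for storing, retrieving, and managing document-based, or semi-structured, data, and are one category of NoSQL database. The smallest unit of storage is the document; documents stored in the same table (collection) can have different attributes, and data can be stored in various formats such as JSON and XML.

1. MongoDB

MongoDB sits between relational and non-relational databases: among non-relational databases it is the most feature-rich and the most like a relational one. Its data structures are very loose, using BSON, a JSON-like format, so it can store fairly complex data types. MongoDB's standout feature is its powerful query language, whose syntax somewhat resembles an object-oriented query language; it covers most of what single-table queries in a relational database can do, and it also supports indexes on the data.

Project page: http://www.mongodb.org/
Getting started: http://www.mongodb.org/display/DOCS/Quickstart
Download: http://www.mongodb.org/downloads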

2. Apache CouchDB

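Apache CouchDB is a document-oriented database management system. It exposes a REST interface that uses JSON as the data format, and views control how documents are organized and presented. CouchDB is a top-level open-source project of the Apache Software Foundation.

Unlike today's popular relational database servers, CouchDB is organized around a set of semantically self-contained documents. Documents in CouchDB are schema-free, meaning they are not required to have any particular structure. This gives CouchDB its own area of use relative to traditional relational databases, and for many applications it is a good alternative to a relational database.

Project home: http://couchdb.apache.org/
Getting started: http://couchdb.apache.org/docs/intro.html
Download: http://couchdb.apache.org/downloads.html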

3. Terrastore

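Terrastore is a high-performance distributed document database built on Terracotta (a well-regarded, fast distributed clustering component). Nodes can be added to or removed from a running cluster dynamically, with no downtime and no configuration changes. Terrastore is accessed over HTTP. It provides a collection-based key/value interface for managing JSON documents and does not require the schema of those documents to be defined in advance. It is easy to operate: installing a complete working cluster takes only a few commands.

Project home: http://code.google.com/p/terrastore/
Getting started: http://code.google.com/p/terrastore/wiki/Documentation
Download: http://code.google.com/p/terrastore/downloads/list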

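4. RavenDB

RavenDB is a new open-source document database for .NET with LINQ support. It aims to provide a high-performance, simply structured, flexible, and scalable NoSQL store on the Windows platform. Raven stores JSON documents in the database, and data can be queried with C# LINQ syntax.

Project home: http://ravendb.net/
Getting started: http://ravendb.net/tutorials
Download: http://ravendb.net/download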

5. OrientDB

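OrientDB is a highly scalable document-graph database management system that combines the flexibility of a document database with a graph database's ability to manage links. It can work in schema-less, schema-full, or mixed mode. It supports many advanced features, such as ACID transactions, fast indexes, and native and SQL queries. Documents can be imported and exported as JSON. Graphs of hundreds of linked documents can be retrieved in a few milliseconds without the costly JOIN operations that relational databases execute.

Project home: http://www.orientechnologies.com/
Getting started: http://code.google.com/p/orient/wiki/Tutorials
Download: http://code.google.com/p/orient/wiki/download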

6. ThruDB

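ThruDB is a set of simple services built on top of the Apache Thrift framework that provide indexing and document storage services for building and scaling websites. Its goal is to offer web developers flexible, fast, and easy-to-use services that can enhance or replace traditional data storage and access layers.

Project home: http://code.google.com/p/thrudb/
Getting started: http://thrudb.googlecode.com/svn/trunk/doc/Thrudb.pdf
Download: http://code.google.com/p/thrudb/source/checkout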

7. SisoDB

SisoDB is a document-oriented DB provider for SQL Server, written in C#, that lets you store objects directly in the database.

Project home: http://www.sisodb.com
Getting started: http://www.sisodb.com/Wiki
Download: https://github.com/danielwertheim/SisoDb-Provider/

8. RaptorDB

RaptorDB is a small, fast, embedded NoSQL storage module that uses B+ tree or MurMur hash indexes. It persists data to disk.

Project home: http://www.codeproject.com/KB/database/RaptorDB.aspx
Getting started: http://www.codeproject.com/KB/database/RaptorDB.aspx
Download: http://www.codeproject.com/KB/database/RaptorDB.aspx

9. CloudKit

CloudKit provides schema-free, auto-versioned, RESTful JSON storage with optional OpenID and OAuth support, including OAuth discovery.

Project home: http://getcloudkit.com/
Getting started: http://getcloudkit.com/api/
Download: https://github.com/jcrosby/cloudkit

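10. Persevere

Persevere is an open-source set of tools for persistence and distributed computing. It uses an intuitive, standards-based JSON interface built on HTTP REST, JSON-RPC, JSONPath, and REST Channels. The Persevere server includes a Persevere JavaScript client, but its standard interface can be used from any framework or client.

Project home: http://code.google.com/p/persevere-framework/
Getting started: http://code.google.com/p/persevere-framework/w/list
Download: http://code.google.com/p/persevere-framework/downloads/list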

11. Apache Jackrabbit

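Apache Jackrabbit is an open-source implementation of JSR-170 provided by the Apache Software Foundation.

As content management applications have become more widespread, the need for a common, standardized API for content repositories has become apparent. The Content Repository for Java Technology API (JSR-170) aims to provide such an interface. A major advantage of JSR-170 is that it is not tied to any particular underlying architecture. For example, the backend data store of a JSR-170 implementation can be a file system, a WebDAV repository, an XML-capable system, or even an SQL database. In addition, JSR-170's export and import features let an integrator switch seamlessly between content backends and JCR implementations.

Project home: http://jackrabbit.apache.org
Getting started: http://jackrabbit.apache.org/getting-started-with-apache-jackrabbit.html
Download: http://jackrabbit.apache.org/downloads.html

Original English article: http://orangeslate.com/2011/12/06/11-open-document-oriented-databases-which-comes-under-nosql-db-category/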

CursusDB: a document-oriented in-memory database


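CursusDB is a fast, open-source, document-oriented in-memory database that offers security, persistence, distribution, availability, and an SQL-like query language.

Genji: a document-oriented embedded SQL database

Genji is an embedded database written in Go that aims to simplify working with data in the modern world. It combines the power of SQL with the versatility of documents to provide maximum flexibility without compromise.

MongoDB study notes, part 2: getting started with MongoDB (databases, documents, and collections)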


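Basic MongoDB concepts:

1. A document is the basic unit of data in MongoDB, similar to a row in a relational database (but much more expressive than a row).

2. A collection can be thought of as a table without column definitions.

3. A single MongoDB instance can host multiple independent databases, each with its own collections and permissions.

4. MongoDB ships with a simple but powerful JavaScript shell, which is very useful for administering MongoDB instances and manipulating data.

5. Every document has a special key, "_id", that is unique within the collection the document belongs to.

Documents

The document is the core concept of MongoDB. A document is an ordered set of keys with their associated values.

Most languages have a comparable data structure, such as a map, hash, or dictionary. In JavaScript, a document is represented as an object: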

{"greeting":"Hello world","age":30}

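This document has two keys: "greeting", whose value is "Hello world", and "age", whose value is 30. In most cases documents are more complex than this and contain many key/value pairs: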

{"greeting":"Hello world","Hello":"Refactor"}

The key/value pairs in a document are ordered, so the document above is different from the one below:

{"Hello":"Refactor","greeting":"Hello world"}

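Values in a document can be strings or one of several other data types. In the example above, the value of "age" is an integer.

Keys in a document are strings. With a few exceptions, a key can contain any UTF-8 character:

A key cannot contain \0 (the null character), which marks the end of the key.

The characters . and $ can be used only in specific contexts. If they are used incorrectly, the driver will complain.

Keys that start with an underscore ("_") are reserved, although this is not strictly enforced.

MongoDB is both type-sensitive and case-sensitive. The following two documents are different: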

{"age":"30"}

{"age":30}

So are the following two:

{"age":30}

{"Age":30}

A MongoDB document cannot contain duplicate keys. The following document is not valid:

{"Hello":"Hello world","Hello":"Refactor"}
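The comparison rules above can be tried out without a MongoDB server. The following is a minimal Python sketch, not MongoDB's own logic: it parses JSON text into `OrderedDict` objects so that key order matters when two documents are compared, and it rejects duplicate keys. The helper name `parse_document` is an assumption made for illustration.

```python
import json
from collections import OrderedDict


def parse_document(text):
    """Parse JSON text into an OrderedDict, rejecting duplicate keys."""
    def build(pairs):
        doc = OrderedDict()
        for key, value in pairs:
            if key in doc:
                raise ValueError(f"duplicate key: {key!r}")
            doc[key] = value
        return doc
    # object_pairs_hook receives the key/value pairs in document order.
    return json.loads(text, object_pairs_hook=build)


# Same keys and values, different order: treated as different documents.
a = parse_document('{"greeting": "Hello world", "Hello": "Refactor"}')
b = parse_document('{"Hello": "Refactor", "greeting": "Hello world"}')
print(a == b)  # False

# Values are type-sensitive and keys are case-sensitive.
print(parse_document('{"age": "30"}') == parse_document('{"age": 30}'))  # False
print(parse_document('{"age": 30}') == parse_document('{"Age": 30}'))    # False
```

Calling `parse_document` on the duplicate-key document shown above raises `ValueError`.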

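If you already know JSON, you are halfway to mastering BSON. The data types that BSON adds are covered later.

An example document:

{ site : "w3cschool.cc" }

The term "object" usually refers to a document.

A document is similar to a record in an RDBMS.

You can insert into, update, and delete from a collection.

The following table maps some MongoDB concepts to their RDBMS counterparts: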

RDBMS	MongoDB
Table	Collection
Column	Key
Value	Value
Record / Row	Document / Object

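The following table lists several data types commonly used in MongoDB.

Data type	Description
string	May be an empty string or any combination of characters.
integer	A whole number.
boolean	A logical value, true or false.
double	Double-precision floating point.
null	Neither zero nor an empty value.
array	An array: a list of values.
object	An object: an entity used in a program. It can be a value, a variable, a function, or a data structure.
timestamp	Stored as a 64-bit value that is guaranteed to be unique only when a single mongod is running. The first 32 bits hold the UTC time in seconds. The last 32 bits are a counter within that second, starting at 0 and incremented each time a new MongoTimestamp object is created.
Internationalized strings	UTF-8 strings.
Object IDs	Documents in MongoDB are identified by the unique key _id. Almost every MongoDB document has the _id field as its first attribute (with some exceptions in system collections and capped collections). The _id value can be of any type. The most common practice is to use the ObjectId type.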
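The 64-bit timestamp layout described in the table can be illustrated with plain integer arithmetic. This is a sketch of the layout only, assuming the "first 32 bits" are the high-order bits. The function names `pack_timestamp` and `unpack_timestamp` are hypothetical and are not part of any MongoDB driver.

```python
def pack_timestamp(seconds, counter):
    """Combine seconds since the epoch and a per-second counter into one 64-bit integer."""
    if not 0 <= seconds < 2**32 or not 0 <= counter < 2**32:
        raise ValueError("both parts must fit in 32 bits")
    # Seconds occupy the high 32 bits, the counter the low 32 bits.
    return (seconds << 32) | counter


def unpack_timestamp(value):
    """Split a 64-bit timestamp back into (seconds, counter)."""
    return value >> 32, value & 0xFFFFFFFF


first = pack_timestamp(1_700_000_000, 0)
second = pack_timestamp(1_700_000_000, 1)
print(unpack_timestamp(second))  # (1700000000, 1)
print(first < second)            # True
```

Because the seconds sit in the high bits, comparing two packed values orders them by time first and by counter second.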

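Collections

A collection is a group of documents. If a document is the counterpart of a row in a relational database, a collection is the counterpart of a table.

Collections are schema-free, which means the documents within one collection can take any shape. The following two documents can be stored in the same collection: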

{"Hello":"Refactor"}

{"Age":30}

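Note that the documents above differ not only in the types of their values (string and integer) but also in their keys. Since a collection can hold any document, a question arises: is there any need for more than one collection? If documents do not have to be separated by schema, why use multiple collections at all?

The reasons are as follows:

1. If all kinds of documents are mixed in one collection, developers must either make sure each query returns only the kind of document it needs or have the application that runs the query handle every kind of document. For example, a query for blog posts would also have to weed out the documents that contain author data.

2. Querying for one particular kind of document within a single collection is slow, and splitting the data into several collections is much faster. For example, if the collection has a key that marks the type and you query for documents whose value is "Refactor1", "Refactor2", or "Refactor3", the query will be slow. Split into three collections by name, the query is much faster (see "Subcollections").

3. Putting documents of the same kind in one collection keeps the data together. Fetching a few posts from a collection that contains only blog posts takes fewer disk seeks than fetching them from a collection that holds both posts and author data.

4. Creating an index imposes some structure on documents (especially with a unique index). Indexes are defined per collection, so putting documents of the same kind in the same collection makes indexes more effective.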

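Valid collection names

A collection name must begin with a letter or an underscore.

A collection name can contain digits.

A collection name cannot contain the dollar sign "$", which is a reserved character.

A collection name cannot be longer than 128 characters.

In addition, "." is allowed in collection names. Collections named this way are called subcollections. For example, if you have a blog collection, you can use blog.title, blog.content, or blog.author to organize your collections better.

For example: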

db.tutorials.php.findOne()
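The naming rules above translate into a short check. This is a sketch in Python that encodes only the rules listed in this section: the 128-character limit is the figure quoted here, and real MongoDB versions may enforce different limits. `is_valid_collection_name` is a hypothetical helper, not a driver function.

```python
MAX_NAME_LENGTH = 128  # limit quoted in this section, treated as an assumption


def is_valid_collection_name(name):
    """Return True if name follows the collection-naming rules listed above."""
    if not name or len(name) > MAX_NAME_LENGTH:
        return False
    first = name[0]
    # Must begin with a letter or an underscore.
    if not (first.isalpha() or first == "_"):
        return False
    # "$" is reserved; "." is allowed and marks a subcollection.
    return "$" not in name


print(is_valid_collection_name("tutorials.php"))  # True
print(is_valid_collection_name("1logs"))          # False
print(is_valid_collection_name("price$"))         # False
```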

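Capped collections

A capped collection is a collection with a fixed size.

It offers very high performance and an age-out behavior (entries expire in insertion order). It is somewhat similar to the "RRD" concept.

Capped collections automatically maintain the insertion order of their objects, with high performance. They are well suited to uses such as logging. Unlike a standard collection, a capped collection must be created explicitly, and you must specify its size in bytes. The collection's data storage space is allocated in advance.

Note that the specified size includes the database's header information.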

> db.createCollection("mycoll", {capped:true, size:100000})

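In a capped collection, you can add new objects.
Updates are allowed, but an object must not grow in storage size. If it would, the update fails.
The database does not allow deletes. Use the drop() method to remove all rows of the collection.
Note: after dropping, you must explicitly re-create the collection.
On 32-bit machines, the maximum size of a capped collection is 1e9 (1 × 10^9) bytes.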
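The age-out behavior can be pictured with a toy model. The class below is a sketch written for illustration, not how MongoDB implements capped collections: it keeps a fixed byte budget and discards the oldest entries, in insertion order, when a new entry would not fit. The name `CappedLog` and its fields are assumptions.

```python
from collections import deque


class CappedLog:
    """Toy model of a capped collection: a byte budget with oldest-first eviction."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self.items = deque()  # oldest entry on the left

    def insert(self, payload):
        if len(payload) > self.max_bytes:
            raise ValueError("entry is larger than the whole collection")
        # Evict the oldest entries until the new one fits.
        while self.used + len(payload) > self.max_bytes:
            self.used -= len(self.items.popleft())
        self.items.append(payload)
        self.used += len(payload)


log = CappedLog(max_bytes=10)
for line in (b"aaaa", b"bbbb", b"cccc"):
    log.insert(line)
print(list(log.items))  # [b'bbbb', b'cccc']
```

The first entry is dropped because the three 4-byte entries together exceed the 10-byte budget.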

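Subcollections

One convention for organizing collections is to use namespaced subcollections separated by the "." character. For example, an application with a blog might have two collections, blog.posts and blog.authors. This is purely for organizational purposes: the blog collection (which may not even exist) has no relationship to its subcollections.

Many MongoDB tools use subcollections:

1. GridFS, a protocol for storing large files, uses subcollections to store file metadata separately from the content chunks.

2. The MongoDB web console organizes the data in its DBTOP section by subcollection.

3. In the database shell, db.blog refers to the blog collection and db.blog.posts refers to the blog.posts collection.

Subcollections are a good way to organize data in MongoDB.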

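Databases

In MongoDB, documents are grouped into collections, and collections are grouped into databases. A single MongoDB instance can host several databases, which can be treated as completely independent. Each database has its own permissions, and even on disk, different databases are stored in separate files. As a rule, store all of an application's data in the same database.

Like collections, databases are identified by name. A database name can be any UTF-8 string that meets the following conditions:

1. It cannot be the empty string ("").

2. It cannot contain ' ' (space), ., $, /, \, or \0 (the null character).

3. It should be all lowercase.

4. It is limited to a maximum of 64 bytes.

These restrictions exist because a database name ends up as a file on the file system.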
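The four conditions above can be expressed as a small check. The following Python sketch encodes only the rules listed in this section. `is_valid_database_name` is a hypothetical helper, and a real server may apply additional or different rules.

```python
# Characters the list above forbids: space, ".", "$", "/", "\" and the null character.
FORBIDDEN = set(' .$/\\\0')


def is_valid_database_name(name):
    """Check a name against the four database-naming rules listed above."""
    if name == "":
        return False
    if any(ch in FORBIDDEN for ch in name):
        return False
    if name != name.lower():
        return False
    # The limit is in bytes, so measure the UTF-8 encoding, not the character count.
    return len(name.encode("utf-8")) <= 64


print(is_valid_database_name("cms"))    # True
print(is_valid_database_name("my.db"))  # False
print(is_valid_database_name("CMS"))    # False
```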

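Some database names are reserved. These databases have special roles and can be accessed directly:

1. admin

In terms of permissions, this is the "root" database. A user added to this database automatically inherits permissions for all databases. Certain server-wide commands, such as listing all databases or shutting down the server, can also be run only from this database.

2. local

This database is never replicated and can be used to store any collections that should stay local to a single server.

3. config

When MongoDB is used in a sharded setup, the config database is used internally to store information about the shards.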

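Important:

Prefixing a collection name with the name of its database gives the fully qualified collection name, called a namespace. For example, the blog.posts collection in the cms database has the namespace cms.blog.posts. A namespace must not exceed 121 bytes, and in practice it should be shorter than 100 bytes.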
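As a small illustration of the namespace rule, the sketch below builds a namespace from a database name and a collection name and compares its size with the 121-byte limit quoted above. Both function names are hypothetical, and the limit is taken from this text rather than from any particular MongoDB version.

```python
def namespace(database, collection):
    """Build the fully qualified name: database name, a dot, then the collection name."""
    return f"{database}.{collection}"


def fits_namespace_limit(ns, limit=121):
    """Compare the UTF-8 size of a namespace with the byte limit quoted above."""
    return len(ns.encode("utf-8")) <= limit


ns = namespace("cms", "blog.posts")
print(ns)                        # cms.blog.posts
print(fits_namespace_limit(ns))  # True
```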
